|
|
unsigned |
|
ApplyDelset (NxsUnsignedSet &delset) |
|
|
|
|
Deletes (i.e., excludes from further analyses) taxa whose indices are contained in the set delset. The taxon indices refer to original taxon indices, not current indices (originals will equal current ones if the number of taxa in the TAXA block equals the number of taxa in the MATRIX command). Returns the number of taxa actually deleted (some may have already been deleted). |
|
|
|
unsigned |
|
ApplyExset (NxsUnsignedSet &exset) |
|
|
|
|
Excludes characters whose indices are contained in the set exset. The indices supplied should refer to the original character indices, not current character indices. Returns number of characters actually excluded (some may have already been excluded). |
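The "actually excluded" count described above can be illustrated with a minimal standalone sketch. It is not NCL's implementation: `applyExset` is a hypothetical free function, the `std::vector<bool>` stands in for the activeChar array, and the sketch assumes no characters were eliminated, so original and current indices coincide.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for ApplyExset. Assumes no characters were
// eliminated, so an original index can be used directly as a current index.
// Returns how many characters changed from active to excluded.
unsigned applyExset(std::vector<bool> &activeChar, const std::set<unsigned> &exset)
{
    unsigned numExcluded = 0;
    for (std::set<unsigned>::const_iterator it = exset.begin(); it != exset.end(); ++it) {
        const unsigned idx = *it;
        assert(idx < activeChar.size());
        if (activeChar[idx]) {
            // Only characters that were still active count as "actually excluded".
            activeChar[idx] = false;
            ++numExcluded;
        }
    }
    return numExcluded;
}
```

Calling the function a second time with the same set returns 0, because every listed character is already excluded.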
|
|
|
unsigned |
|
ApplyIncludeset (NxsUnsignedSet &inset) |
|
|
|
|
Includes characters whose indices are contained in the set inset. The indices supplied should refer to the original character indices, not current character indices. |
|
|
|
unsigned |
|
ApplyRestoreset (NxsUnsignedSet &restoreset) |
|
|
|
|
Restores (i.e., includes in further analyses) taxa whose indices are contained in the set restoreset. The taxon indices refer to original taxon indices, not current indices (originals will equal current ones if number of taxa in TAXA block equals number of taxa in MATRIX command). |
|
|
|
void |
|
BuildCharPosArray (bool check_eliminated) |
|
|
|
|
Use to allocate memory for (and initialize) charPos array, which keeps track of the original character index in cases where characters have been eliminated. This function is called by HandleEliminate in response to encountering an ELIMINATE command in the data file, and this is probably the only place where BuildCharPosArray should be called with check_eliminated true. BuildCharPosArray is also called in HandleMatrix, HandleCharstatelabels, HandleStatelabels, and HandleCharlabels. |
|
V |
|
unsigned |
|
CharLabelToNumber (NxsString s) |
|
|
|
|
Converts a character label to a 1-offset number corresponding to the character's position within charLabels. This method overrides the virtual function of the same name in the NxsBlock base class. If s is not a valid character label, returns the value 0. |
|
|
|
void |
|
Consume (NxsCharactersBlock &other) |
|
|
|
|
Transfers all data from other to this object, leaving other completely empty. Used to convert a NxsDataBlock object to a NxsCharactersBlock object in programs where it is desirable to just have a NxsCharactersBlock for storage but also allow users to enter the information in the form of the deprecated NxsDataBlock. This function does not make a copy of such things as the data matrix, instead just transferring the pointer to that object from other to this. This is why it was named Consume rather than CopyFrom. |
|
V |
|
void |
|
DebugShowMatrix (ostream &out, bool use_matchchar, char *marginText) |
|
|
|
|
Provides a dump of the contents of the matrix variable. Useful for testing whether data is being read as expected. If marginText is NULL, matrix output is placed flush left. If each line of output should be prefaced with a tab character, specify "\t" for marginText. |
|
I |
|
void |
|
DeleteTaxon (unsigned i) |
|
|
|
|
Deletes taxon whose 0-offset current index is i. If taxon has already been deleted, this function has no effect. |
|
I |
|
void |
|
ExcludeCharacter (unsigned i) |
|
|
|
|
Excludes character whose 0-offset current index is i. If character has already been excluded, this function has no effect. |
|
I |
|
bool |
|
*GetActiveCharArray () |
|
|
|
|
Returns activeChar data member (pointer to first element of the activeChar array). Access to this protected data member is necessary in certain circumstances, such as when a NxsCharactersBlock object is stored in another class, and that other class needs direct access to the activeChar array even though it is not derived from NxsCharactersBlock. |
|
I |
|
bool |
|
*GetActiveTaxonArray () |
|
|
|
|
Returns activeTaxon data member (pointer to first element of the activeTaxon array). Access to this protected data member is necessary in certain circumstances, such as when a NxsCharactersBlock object is stored in another class, and that other class needs direct access to the activeTaxon array even though it is not derived from NxsCharactersBlock. |
|
I |
|
NxsString |
|
GetCharLabel (unsigned i) |
|
|
|
|
Returns label for character i, if a label has been specified. If no label was specified, returns string containing a single blank (i.e., " "). |
|
I |
|
unsigned |
|
GetCharPos (unsigned origCharIndex) |
|
|
|
|
Returns current index of character in matrix. This may differ from the original index if some characters were removed using an ELIMINATE command. For example, character number 9 in the original data matrix may now be at position 8 if the original character 8 was eliminated. The parameter origCharIndex is assumed to range from 0 to ncharTotal - 1. |
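The mapping from original to current character indices can be sketched as follows. This is a standalone illustration under stated assumptions, not the library's code: `buildCharPos` and the sentinel `kEliminated` are hypothetical names, and the use of `UINT_MAX` to mark eliminated characters is an assumption of this sketch.

```cpp
#include <cassert>
#include <climits>
#include <set>
#include <vector>

// Hypothetical sentinel for an eliminated character (assumption of this sketch).
const unsigned kEliminated = UINT_MAX;

// Builds a lookup table from each 0-offset original character index to its
// current column index in the matrix. Eliminated characters get kEliminated.
std::vector<unsigned> buildCharPos(unsigned ncharTotal, const std::set<unsigned> &eliminated)
{
    std::vector<unsigned> charPos(ncharTotal, kEliminated);
    unsigned current = 0;
    for (unsigned orig = 0; orig < ncharTotal; ++orig) {
        if (eliminated.count(orig) == 0)
            charPos[orig] = current++;  // surviving characters are packed to the left
    }
    return charPos;
}
```

With ten characters and original character number 8 (index 7) eliminated, original character number 9 (index 8) maps to current index 7, that is, position 8 in 1-offset terms, matching the example in the description.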
|
|
I |
|
char |
|
GetGapSymbol () |
|
|
|
|
Returns the gap symbol currently in effect. If no gap symbol specified, returns ' '. |
|
I |
|
int |
|
GetInternalRepresentation (unsigned i, unsigned j, unsigned k) |
|
|
|
|
Returns internal representation of the state for taxon i, character j. In the normal situation, k is 0 meaning there is only one state with no uncertainty or polymorphism. If there are multiple states, specify a number in the range [0..n) where n is the number of states returned by the GetNumStates function. Use the IsPolymorphic function to determine whether the multiple states correspond to uncertainty in state assignment or polymorphism in the taxon. The value returned from this function is one of the following:
- -3 means gap state (see note below)
- -2 means missing state (see note below)
- an integer 0 or greater is internal representation of a state
Note: gap and missing states are actually represented internally in a different way; for a description of the actual internal representation of states, see the documentation for NxsDiscreteDatum. |
|
I |
|
char |
|
GetMatchcharSymbol () |
|
|
|
|
Returns the matchchar symbol currently in effect. If no matchchar symbol specified, returns ' '. |
|
V |
|
unsigned |
|
GetMaxObsNumStates () |
|
|
|
|
Returns the maximum observed number of states for any character. Note: this function is rather slow, as it must walk through each row of each column, adding the states encountered to a set, then finally returning the size of the set. Thus, if this function is called often, it would be advisable to initialize an array using this function, then refer to the array subsequently. |
|
I |
|
char |
|
GetMissingSymbol () |
|
|
|
|
Returns the missing data symbol currently in effect. If no missing data symbol specified, returns ' '. |
|
I |
|
unsigned |
|
GetNChar () |
|
|
|
|
Returns the value of nchar. |
|
|
I |
|
unsigned |
|
GetNTax () |
|
|
|
|
Returns the value of ntax. |
|
|
|
|
unsigned |
|
GetNumActiveChar () |
|
|
|
|
Performs a count of the number of characters for which activeChar array reports true. |
|
|
|
unsigned |
|
GetNumActiveTaxa () |
|
|
|
|
Performs a count of the number of taxa for which activeTaxon array reports true. |
|
I |
|
unsigned |
|
GetNumEliminated () |
|
|
|
|
Returns the number of characters eliminated with the ELIMINATE command. |
|
I |
|
unsigned |
|
GetNumEquates () |
|
|
|
|
Returns the number of stored equate associations. |
|
I |
|
unsigned |
|
GetNumMatrixCols () |
|
|
|
|
Returns the number of actual columns in matrix. This number is equal to nchar, but can be smaller than ncharTotal since the user could have eliminated some of the characters. |
|
I |
|
unsigned |
|
GetNumMatrixRows () |
|
|
|
|
Returns the number of actual rows in matrix. This number is equal to ntax, but can be smaller than ntaxTotal since the user did not have to provide data for all taxa specified in the TAXA block. |
|
I |
|
unsigned |
|
GetNumStates (unsigned i, unsigned j) |
|
|
|
|
Returns the number of states for taxon i, character j. |
|
IV |
|
unsigned |
|
GetObsNumStates (unsigned j) |
|
|
|
|
Returns the number of states for character j over all taxa. Note: this function is rather slow, as it must walk through each row, adding the states encountered to a set, then finally returning the size of the set. Thus, if this function is called often, it would be advisable to initialize an array using this function, then refer to the array subsequently. |
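The caching advice above can be sketched as follows. This is a simplified standalone illustration, not NCL code: `StateMatrix`, `countObservedStates`, and `cacheObservedStates` are hypothetical names, and each cell is assumed to hold exactly one state (no gaps, missing data, polymorphism, or uncertainty).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical simplified matrix: rows are taxa, columns are characters,
// and every cell holds a single state code.
typedef std::vector<std::vector<int> > StateMatrix;

// Counts the distinct states in column j by walking every row and
// collecting the states in a set, as the description outlines.
unsigned countObservedStates(const StateMatrix &m, std::size_t j)
{
    std::set<int> seen;
    for (std::size_t i = 0; i < m.size(); ++i) {
        assert(j < m[i].size());
        seen.insert(m[i][j]);
    }
    return static_cast<unsigned>(seen.size());
}

// Computes the count once per character so later lookups are plain array reads.
std::vector<unsigned> cacheObservedStates(const StateMatrix &m)
{
    std::vector<unsigned> counts;
    if (m.empty())
        return counts;
    const std::size_t nchar = m[0].size();
    counts.reserve(nchar);
    for (std::size_t j = 0; j < nchar; ++j)
        counts.push_back(countObservedStates(m, j));
    return counts;
}
```

The cached vector must be rebuilt if the matrix changes, which is the usual cost of trading repeated scans for a one-time pass.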
|
|
|
unsigned |
|
GetOrigCharIndex (unsigned j) |
|
|
|
|
Returns the original character index in the range [0..ncharTotal). Will be equal to j unless some characters were eliminated. |
|
I |
|
unsigned |
|
GetOrigCharNumber (unsigned j) |
|
|
|
|
Returns the original character number (used in the NEXUS data file) in the range [1..ncharTotal]. Will be equal to j + 1 unless some characters were eliminated. |
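The reverse lookup, from a current index back to the original index or number, might look like the sketch below. It is an illustration only: `origCharIndex` is a hypothetical free function that recomputes the answer from the eliminated set, whereas the class presumably keeps its own bookkeeping.

```cpp
#include <cassert>
#include <climits>
#include <set>

// Hypothetical helper: returns the 0-offset original index of the character
// currently stored in column j, or UINT_MAX if j is out of range.
unsigned origCharIndex(unsigned j, unsigned ncharTotal, const std::set<unsigned> &eliminated)
{
    unsigned current = 0;
    for (unsigned orig = 0; orig < ncharTotal; ++orig) {
        if (eliminated.count(orig) != 0)
            continue;           // eliminated characters occupy no column
        if (current == j)
            return orig;
        ++current;
    }
    return UINT_MAX;
}
```

Adding 1 to the result gives the 1-offset character number used in the NEXUS data file.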
|
|
|
unsigned |
|
GetOrigTaxonIndex (unsigned i) |
|
|
|
|
Returns the original taxon index in the range [0..ntaxTotal). Will be equal to i unless data was not provided for some taxa listed in a preceding TAXA block. |
|
I |
|
unsigned |
|
GetOrigTaxonNumber (unsigned i) |
|
|
|
|
Returns the original taxon number (used in the NEXUS data file) in the range [1..ntaxTotal]. Will be equal to i + 1 unless data was not provided for some taxa listed in a preceding TAXA block. |
|
I |
|
char |
|
GetState (unsigned i, unsigned j, unsigned k) |
|
|
|
|
Returns symbol from symbols list representing the state for taxon i and character j. The normal situation in which there is only one state with no uncertainty or polymorphism is represented by k = 0. If there are multiple states, specify a number in the range [0..n) where n is the number of states returned by the GetNumStates function. Use the IsPolymorphic function to determine whether the multiple states correspond to uncertainty in state assignment or polymorphism in the taxon. Assumes symbols is non-NULL. |
|
|
|
NxsString |
|
GetStateLabel (unsigned i, unsigned j) |
|
|
|
|
Returns label for character state j at character i, if a label has been specified. If no label was specified, returns string containing a single blank (i.e., " "). |
|
I |
|
char |
|
*GetSymbols () |
|
|
|
|
Returns data member symbols. Warning: returned value may be NULL. |
|
|
I |
|
unsigned |
|
GetTaxPos (unsigned origTaxonIndex) |
|
|
|
|
Returns current index of taxon in matrix. This may differ from the original index if some taxa were listed in the TAXA block but not in the DATA or CHARACTERS block. The parameter origTaxonIndex is assumed to range from 0 to ntaxTotal - 1. |
|
|
|
void |
|
HandleCharlabels (NxsToken &token) |
|
|
|
|
Called when CHARLABELS command needs to be parsed from within the CHARACTERS block. Deals with everything after the token CHARLABELS up to and including the semicolon that terminates the CHARLABELS command. If an ELIMINATE command has been processed, labels for eliminated characters will not be stored. |
|
|
|
void |
|
HandleCharstatelabels (NxsToken &token) |
|
|
|
|
Called when CHARSTATELABELS command needs to be parsed from within the CHARACTERS block. Deals with everything after the token CHARSTATELABELS up to and including the semicolon that terminates the CHARSTATELABELS command. Resulting charLabels vector will store labels only for characters that have not been eliminated, and likewise for charStates. Specifically, charStates[0] refers to the vector of character state labels for the first non-eliminated character. |
|
|
|
void |
|
HandleDimensions (NxsToken &token, NxsString newtaxaLabel, NxsString ntaxLabel, NxsString ncharLabel) |
|
|
|
|
Called when DIMENSIONS command needs to be parsed from within the CHARACTERS block. Deals with everything after the token DIMENSIONS up to and including the semicolon that terminates the DIMENSIONS command. newtaxaLabel, ntaxLabel and ncharLabel are simply "NEWTAXA", "NTAX" and "NCHAR" for this class, but may be different for derived classes that use newtaxa, ntax and nchar for other things (e.g., ntax is the number of populations in an ALLELES block). |
|
|
|
void |
|
HandleEliminate (NxsToken &token) |
|
|
|
|
Called when ELIMINATE command needs to be parsed from within the CHARACTERS block. Deals with everything after the token ELIMINATE up to and including the semicolon that terminates the ELIMINATE command. Any character numbers or ranges of character numbers specified are stored in the NxsUnsignedSet eliminated, which remains empty until an ELIMINATE command is processed. Note that, as with all sets, the character ranges are adjusted so that their offset is 0. For example, given "eliminate 4-7;" in the data file, the eliminated set would contain the values 3, 4, 5 and 6 (not 4, 5, 6 and 7). It is assumed that the ELIMINATE command comes before character labels and/or character state labels have been specified; an error message is generated if the user attempts to use ELIMINATE after a CHARLABELS, CHARSTATELABELS, or STATELABELS command. |
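The offset adjustment in the "eliminate 4-7;" example can be sketched as below. `addRangeZeroOffset` is a hypothetical helper written for this illustration, and `std::set<unsigned>` is used as a stand-in for NxsUnsignedSet; the actual parsing of tokens is not shown.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: inserts the 1-offset inclusive range [first..last],
// as written in the data file, into s as 0-offset indices.
void addRangeZeroOffset(std::set<unsigned> &s, unsigned first, unsigned last)
{
    assert(first >= 1 && first <= last);
    for (unsigned n = first; n <= last; ++n)
        s.insert(n - 1);  // shift from 1-offset numbers to 0-offset indices
}
```

Calling `addRangeZeroOffset(s, 4, 7)` on an empty set leaves it holding 3, 4, 5 and 6.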
|
|
|
void |
|
HandleEndblock (NxsToken &token, NxsString charToken) |
|
|
|
|
Called when the END or ENDBLOCK command needs to be parsed from within the CHARACTERS block. Does two things:
- checks to make sure the next token in the data file is a semicolon
- eliminates character labels and character state labels for characters that have been eliminated |
|
V |
|
void |
|
HandleFormat (NxsToken &token) |
|
|
|
|
Called when FORMAT command needs to be parsed from within the CHARACTERS block. Deals with everything after the token FORMAT up to and including the semicolon that terminates the FORMAT command. |
|
V |
|
void |
|
HandleMatrix (NxsToken &token) |
|
|
|
|
Called when MATRIX command needs to be parsed from within the CHARACTERS block. Deals with everything after the token MATRIX up to and including the semicolon that terminates the MATRIX command. |
|
V |
|
bool |
|
HandleNextState (NxsToken &token, unsigned i, unsigned j) |
|
|
|
|
Called from HandleStdMatrix or HandleTransposedMatrix function to read in the next state. Always returns true except in the special case of an interleaved matrix, in which case it returns false if a newline character is encountered before the next token. |
|
|
|
void |
|
HandleStatelabels (NxsToken &token) |
|
|
|
|
Called when STATELABELS command needs to be parsed from within the CHARACTERS block. Deals with everything after the token STATELABELS up to and including the semicolon that terminates the STATELABELS command. Note that the character numbers are shifted back by one before being stored, so that the character numbers in the NxsStringVectorMap objects are 0-offset rather than 1-offset as in the NEXUS data file. |
|
V |
|
void |
|
HandleStdMatrix (NxsToken &token) |
|
|
|
|
Called from HandleMatrix function to read in a standard (i.e., non-transposed) matrix. Interleaving, if applicable, is dealt with herein. |
|
|
|
void |
|
HandleTaxlabels (NxsToken &token) |
|
|
|
|
Called when TAXLABELS command needs to be parsed from within the CHARACTERS block. Deals with everything after the token TAXLABELS up to and including the semicolon that terminates the TAXLABELS command. |
|
V |
|
unsigned |
|
HandleTokenState (NxsToken &token, unsigned j) |
|
|
|
|
Called from HandleNextState to read in the next state when TOKENS was specified. Looks up state in character states listed for the character to make sure it is a valid state, and returns state's value (0, 1, 2, ...). Note: does NOT handle adding the state's value to matrix. Save the return value (call it k) and use the following command to add it to matrix: matrix->AddState(i, j, k); |
|
V |
|
void |
|
HandleTransposedMatrix (NxsToken &token) |
|
|
|
|
Called from HandleMatrix function to read in a transposed matrix. Interleaving, if applicable, is dealt with herein. |
|
I |
|
void |
|
IncludeCharacter (unsigned i) |
|
|
|
|
Includes character whose 0-offset current index is i. If character is already active, this function has no effect. |
|
I |
|
bool |
|
IsActiveChar (unsigned j) |
|
|
|
|
Returns true if character j is active. If character j has been excluded, returns false. Assumes j is in the range [0..nchar). |
|
I |
|
bool |
|
IsActiveTaxon (unsigned i) |
|
|
|
|
Returns true if taxon i is active. If taxon i has been deleted, returns false. Assumes i is in the range [0..ntax). |
|
I |
|
bool |
|
IsDeleted (unsigned i) |
|
|
|
|
Returns true if taxon number i has been deleted, false otherwise. |
|
|
|
bool |
|
IsEliminated (unsigned origCharIndex) |
|
|
|
|
Returns true if character number origCharIndex was eliminated, false otherwise. Returns false immediately if eliminated set is empty. |
|
I |
|
bool |
|
IsExcluded (unsigned j) |
|
|
|
|
Returns true if character j has been excluded. If character j is active, returns false. Assumes j is in the range [0..nchar). |
|
I |
|
bool |
|
IsGapState (unsigned i, unsigned j) |
|
|
|
|
Returns true if the state at taxon i, character j is the gap state, false otherwise. Assumes matrix is non-NULL. |
|
|
|
bool |
|
IsInSymbols (char ch) |
|
|
|
|
Returns true if ch can be found in the symbols array. The value of respectingCase is used to determine whether or not the search should be case sensitive. Assumes symbols is non-NULL. |
|
I |
|
bool |
|
IsInterleave () |
|
|
|
|
Returns true if INTERLEAVE was specified in the FORMAT command, false otherwise. |
|
I |
|
bool |
|
IsLabels () |
|
|
|
|
Returns true if LABELS was specified in the FORMAT command, false otherwise. |
|
I |
|
bool |
|
IsMissingState (unsigned i, unsigned j) |
|
|
|
|
Returns true if the state at taxon i, character j is the missing state, false otherwise. Assumes matrix is non-NULL. |
|
I |
|
bool |
|
IsPolymorphic (unsigned i, unsigned j) |
|
|
|
|
Returns true if taxon i is polymorphic for character j, false otherwise. Assumes matrix is non-NULL. Note that return value will be false if there is only one state (i.e., one cannot tell whether there is uncertainty using this function). |
|
I |
|
bool |
|
IsRespectCase () |
|
|
|
|
Returns true if RESPECTCASE was specified in the FORMAT command, false otherwise. |
|
I |
|
bool |
|
IsTokens () |
|
|
|
|
Returns true if TOKENS was specified in the FORMAT command, false otherwise. |
|
I |
|
bool |
|
IsTranspose () |
|
|
|
|
Returns true if TRANSPOSE was specified in the FORMAT command, false otherwise. |
|
C |
|
|
|
NxsCharactersBlock (NxsTaxaBlock *tb, NxsAssumptionsBlock *ab) |
|
|
|
|
Initializes id to "CHARACTERS", taxa to tb, assumptionsBlock to ab, ntax, ntaxTotal, nchar and ncharTotal to 0, newchar to true, newtaxa, interleaving, transposing, respectingCase, tokens and formerly_datablock to false, datatype to NxsCharactersBlock::standard, missing to '?', gap and matchchar to ' ', and matrix, charPos, taxonPos, activeTaxon, and activeChar to NULL. The ResetSymbols member function is called to reset the symbols data member. Assumes that tb and ab point to valid NxsTaxaBlock and NxsAssumptionsBlock objects, respectively. |
|
D |
|
|
|
~NxsCharactersBlock () |
|
|
|
|
Deletes any memory allocated to the arrays symbols, charPos, taxonPos, activeChar, and activeTaxon. Flushes the containers charLabels, eliminated, and deleted. Also deletes memory allocated to matrix. |
|
|
|
unsigned |
|
PositionInSymbols (char ch) |
|
|
|
|
Returns position of ch in symbols array. The value of respectingCase is used to determine whether the search should be case sensitive or not. Assumes symbols is non-NULL. Returns UINT_MAX if ch is not found in symbols. |
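A lookup with the behavior described above could be sketched as follows. This is a standalone approximation rather than the member function itself: `positionInSymbols` is a hypothetical free function that takes the symbols array and the respectingCase flag as parameters instead of reading data members.

```cpp
#include <cassert>
#include <cctype>
#include <climits>
#include <cstddef>

// Hypothetical free-function version of the lookup. Returns the 0-offset
// position of ch in the NUL-terminated symbols array, or UINT_MAX if absent.
unsigned positionInSymbols(const char *symbols, char ch, bool respectingCase)
{
    assert(symbols != NULL);
    for (unsigned i = 0; symbols[i] != '\0'; ++i) {
        char a = symbols[i];
        char b = ch;
        if (!respectingCase) {
            // Fold both sides to upper case for a case-insensitive comparison.
            a = static_cast<char>(std::toupper(static_cast<unsigned char>(a)));
            b = static_cast<char>(std::toupper(static_cast<unsigned char>(b)));
        }
        if (a == b)
            return i;
    }
    return UINT_MAX;
}
```

Callers need to compare the result against `UINT_MAX` before using it as an index.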
|
V |
|
void |
|
Read (NxsToken &token) |
|
|
|
|
This function provides the ability to read everything following the block name (which is read by the NxsReader object) to the END or ENDBLOCK statement. Characters are read from the input stream in. Overrides the abstract virtual function in the base class. |
|
V |
|
void |
|
Report (ostream &out) |
|
|
|
|
This function outputs a brief report of the contents of this CHARACTERS block. Overrides the abstract virtual function in the base class. |
|
V |
|
void |
|
Reset () |
|
|
|
|
Returns NxsCharactersBlock object to the state it was in when first created. |
|
|
|
void |
|
ResetSymbols () |
|
|
|
|
Resets standard symbol set after a change in datatype is made. Also flushes equates list and installs standard equate macros for the current datatype. |
|
I |
|
void |
|
RestoreTaxon (unsigned i) |
|
|
|
|
Restores taxon whose 0-offset current index is i. If taxon is already active, this function has no effect. |
|
|
|
void |
|
ShowStateLabels (ostream &out, unsigned i, unsigned j, unsigned first_taxon) |
|
|
|
|
Looks up the state(s) at row i, column j of matrix and writes it (or them) to out. If there is uncertainty or polymorphism, the list of states is surrounded by the appropriate set of symbols (i.e., parentheses for polymorphism, curly brackets for uncertainty). If TOKENS was specified, the output takes the form of the defined state labels; otherwise, the correct symbol is looked up in symbols and output. |
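The bracketing convention described above (parentheses for polymorphism, curly brackets for uncertainty) can be sketched as below. `formatStates` is a hypothetical helper for illustration only; it covers the symbols case and leaves out the TOKENS case, in which state labels are written instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: translates 0-offset state codes into symbols and wraps
// multi-state cells in "()" for polymorphism or "{}" for uncertainty.
std::string formatStates(const std::vector<unsigned> &states, bool polymorphic,
                         const std::string &symbols)
{
    assert(!states.empty());
    const bool multiple = states.size() > 1;
    std::string out;
    if (multiple)
        out += (polymorphic ? '(' : '{');
    for (std::size_t k = 0; k < states.size(); ++k) {
        assert(states[k] < symbols.size());
        out += symbols[states[k]];  // state code indexes into the symbols list
    }
    if (multiple)
        out += (polymorphic ? ')' : '}');
    return out;
}
```

A single state is written without any surrounding brackets, so the polymorphic flag only matters when more than one state is present.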
|
I |
|
void |
|
ShowStates (ostream &out, unsigned i, unsigned j) |
|
|
|
|
Shows the states for taxon i, character j, on the stream out. Uses symbols array to translate the states from the way they are stored (as integers) to the symbol used in the original data matrix. Assumes i is in the range [0..ntax) and j is in the range [0..nchar). Also assumes matrix is non-NULL. |
|
IV |
|
unsigned |
|
TaxonLabelToNumber (NxsString s) |
|
|
|
|
Converts a taxon label to a number corresponding to the taxon's position within the list maintained by the NxsTaxaBlock object. This method overrides the virtual function of the same name in the NxsBlock base class. If s is not a valid taxon label, returns the value 0. |
|
|
|
void |
|
WriteStates (NxsDiscreteDatum &d, char *s, unsigned slen) |
|
|
|
|
Writes out the state (or states) stored in this NxsDiscreteDatum object to the buffer s using the symbols array to do the necessary translation of the numeric state values to state symbols. In the case of polymorphism or uncertainty, the list of states will be surrounded by parentheses or curly brackets (respectively). Assumes s is non-NULL and long enough to hold everything printed. |
|