Tree


Lingfa Yang

  1. Concepts: tree, node, leaf, root, path, depth, size, walk.
  2. TokenInfo
  3. TokenNode
  4. The first code sample
  5. The second code sample
  6. Construction/Destruction Methods
  7. Query methods: parent(); name(); empty(); size(); child(); firstChild(); lastChild(); nextSibling(); previousSibling(); node(); nodes(); contains(); ==(); !=()
  8. Manipulation Methods: subtract(); del(); prepend(); rotate(); chop(); adopt();
  9. Sort
  10. Save
  11. Walk

To Top
  1. Concepts

    Tree is a widely-used data structure that emulates a hierarchical tree structure with a set of linked nodes. It is an acyclic connected graph where each node has a set of zero or more children nodes, and at most one parent node
    ...

    A node contains:

    • one parent node,
    • 0 or 1, 2, ... child nodes, and
    • info - information about itself, such as its name, its attributes, its type ...

    A leaf is a node which has no child.

    A tree has one root, which is a node who has no parent.

    A path (root path) is a path to the root.

    A depth is a length of the path to its root (root path).

    A size of a node means number of children a node has.

    A walk of the tree (tree traversal) means visit every nodes a tree has.

  2. To Top

    TokenInfo

    StreamReader reads token one-by-one. A token can be one of the three types a start element, an end element or characters. A start element contains a name, and attributes; an element has a name only; while characters token contains a text only.
    struct TokenInfo 
    {
      enum TokenType type;
      fstring prefix;   // prefix of a tag name
      fstring name;     // local name
      fwstring text;    // element content
      PairList attrs; 
      ...
    };
    
    To Top
  3. TokenNode

    class TokenNode
    {
    public:
      TokenInfo info;
      std::vector <TokenNode *>  children;
    protected:
      TokenNode *m_parent;
    };
    
    To Top
  4. The first code sample:

    Create a tree from scratch, and save to a file.
    bool sample1()
    {
      // create a tree
      TokenNode *p = new TokenNode("p");        // root
      TokenNode *r = new TokenNode("r", p);     // a child
      TokenNode *rPr = new TokenNode("rPr", r);
      TokenNode *t = new TokenNode("t", r);
      new TokenTextNode(L"some text", t);       // a text node (leaf)
      
      // save
      StreamFileWriter fw("c:/1.xml");
      p->save(&fw, 4);
    
      // release
      delete p;
      return true;
    }
    

    The outout file:

    <p>
        <r>
            <rPr/>
            <t>some text</t>
        </r>
    </p>
    

    The tree view:
    To Top

  5. The second code sample:

    Create a tree during stream reading. Retrieve a specific node of the given path, and add an attribute to the node.
    
    bool sample2()
    {
      StreamReader reader;
      if (!reader.readFile("c:/1.xml")) return false;
    
      while (!reader.atEnd()) {
        reader.readNext();
        if (reader.isStartElement()) {
          if (reader.name() == "p") {
            TokenNode *p = reader.tree();        // create a tree
            TokenNode *rPr = p->node("p/r/rPr"); // retrieve a node
            if (rPr) {
              rPr->info.attrs << StrPair("b", "1"); // add an attribute
            }
            delete p; // release
          }
        }
      }
      return true;
    }
    

    Now, the tree is:
    To Top

  6. Construction/Destruction Methods

    1. Construction

      Copy constructor:
        TokenNode p1(*p);
      
      The same as:
        TokenNode p1 = *p;
      
      or if you like pointer,
        TokenNode *p1; 
        *p1 = *p;
        delete p1;
      
      The copy, including its children recursive copy,
      TokenNode & TokenNode::operator = (const TokenNode & that)
      {
        clear();
        this->m_parent = that.m_parent;
        this->info = that.info;
        for(size_t i = 0; i < that.children.size(); ++ i) {
          this->children.push_back(new TokenNode(*that.children[i]));
        }
        return *this;
      }
      
    2. Destruction

      clear() means remove all of its children.
      TokenNode::~TokenNode()
      {
        clear();
      }
      void TokenNode::clear()
      {
        std::vector <TokenNode *>::iterator i, 
          b = children.begin(), 
          e = children.end();
        for (i = b; i != e; ++ i) {
          delete *i;
        }
        children.clear();
      }
      
  7. To Top

    Query methods

    1. Simple Query methods about its parent, its name, its size.

        TokenNode *parent() const {return m_parent;}
        const fstring & name() const {return info.name;}
        bool empty() const {return children.empty();}
        size_t size() const {return children.size();}
      
    2. Query a child by name, by index ...

        TokenNode *child(const fstring & name) const;
        TokenNode *child(size_t index) const;
        TokenNode *firstChild() const;
        TokenNode *lastChild() const;
      
    3. Query a sibling

        TokenNode *nextSibling() const;
        TokenNode *previousSibling() const;
      
    4. Query a single node of a specific path.

        TokenNode *node(const fstring &path);
      
      For example,
        TokenNode *rPr = p->node("p/r/rPr");
      
    5. Collect all nodes including itself, all children, and grandchildren and so on, recursively.

      void TokenNode::nodes(std::vector<TokenNode *> & v)
      {
        v.push_back(this);
        for (size_t i = 0; i < size(); ++ i) {
          TokenNode *child = children[i];
          child->nodes(v); // recursion
        };
      }
      
    6. Flatten a tree structure into an one-dimensional array.

      (see above)
    7. Collect all nodes of the same name.

      void TokenNode::nodes(std::vector<TokenNode *> & v, const fstring & name)
      {
        if (info.name == name) v.push_back(this);
        for (size_t i = 0; i < size(); ++ i) {
          TokenNode *child = children[i];
          child->nodes(v, name); // recursion
        };
      }
      
    8. Direct children contains

      
      /*!
        Return true if the given child name matches one of its children. 
      */
      bool TokenNode::contains(const fstring & childName) const
      {
        for(size_t i = 0; i < children.size(); ++ i) {
          if (children[i]->info.name == childName) return true;
        }
        return false;
      }
      

      If a child node is passed, contains() will call operator ==
      bool TokenNode::contains(const TokenNode * child) const
      {
        for(size_t i = 0; i < children.size(); ++ i) {
          if (*children[i] == *child) return true;
        }
        return false; 
      }
      

    9. Operator qual and not equal methods

      
      /*!
        Define a tree equal as equal content and equal belongs.
        Return false as early as none-equal detected.
        Return true when all nodes and leaves get walked (traversal).
      */
      bool TokenNode::operator == (const TokenNode & that) const
      {
        if (this->info != that.info) return false;
      
        if (this->.size() != that.size()) return false;
      
        std::vector <TokenNode *>::const_iterator i
          , b = this->children.begin(), e = this->children.end();
      
        std::vector <TokenNode *>::const_iterator j
          , c = that.children.begin(), f = that.children.end();
      
        for (i = b, j = c; i != e; ++ i, ++ j) { 
          if (*(*i) != *(*j)) { // recursion
            return false;
          }
        }
        return true;
      }
      
      bool operator != (const TokenNode & that) const 
      {return !(*this == that);}
      
  8. To Top

    Manipulation Methods:

    1. Remove same child(ren) own by the other node

      void TokenNode::subtract(const TokenNode * that)
      {
        bool count = 0;
        for(size_t i = 0; i < children.size(); ++ i) {
          if (that->contains(children[i])) {
            delete children[i];
            children[i] = NULL;
            ++ count;
          }
        }
      
        // shrink
        if (count) {
          std::vector <TokenNode *> tmp;
          for(size_t i = 0; i < children.size(); ++ i) {
            if (!children[i]) continue;
            tmp.push_back(children[i]);
          }
          children = tmp;
        }
      }
      
    2. Delete a child of the given name.

      bool TokenNode::del(const fstring & childName)
      {
        for(size_t i = 0; i < children.size(); ++ i) {
          if (children[i]->info.name == childName) {
            delete children[i];
            while (i != children.size() - 1) {
              children[i] = children[i+1]; // shift one step
              ++ i;
            }
            children.pop_back(); // shrink one
            return true;
          }
        }
        return false;
      }
      
    3. prepend on child

      void TokenNode::prepend(TokenNode * child)
      {
        children.push_back(child);
        this->rotate();// clockwise
      }
      
    4. Rotate children.

      Last one turns to be the first if rotate clockwise.
      void TokenNode::rotate(bool conterclockwise)
      {
        size_t sz = children.size();
        if (sz < 2) return;
        if (conterclockwise) {
          TokenNode *first = children[0];
          for (size_t i = 0; i < size()-1; ++ i) {
            children[i] = children[i+1];
          }
          children[sz-1] = first;
        }
        else {
          TokenNode *last = children[sz-1];
          for (size_t i = 0; i < size()-1; ++ i) {
            children[sz-i-1] = children[sz-i-2];
          }
          children[0] = last;
        }
      }
      
    5. Chop off the last child

      bool TokenNode::chop()
      {
        if (children.empty()) return false;
        
        size_t sz = children.size();
        TokenNode *last = children[sz-1];
        delete last;
        children.pop_back();
        return true;
      }
      
    6. Adopt/accept/graft all children from another tree.

      bool TokenNode::adopt(TokenNode *& tree)
      {
        if (!tree) return false;
        for (size_t i = 0; i < tree->size(); ++ i) {
          this->children.push_back(tree->children[i]);    
        }
        tree->children.clear();
        return true;
      }
      

  9. To Top

    Sort

    void TokenNode::sort() 
    {
      std::sort(children.begin(), children.end(), TokenNode());
    
      for (size_t i = 0; i < children.size(); ++ i) {
        children[i]->sort();
      }
    }
    bool TokenNode::operator () (const TokenNode *a, 
                                 const TokenNode *b) const 
    {
      return a->info < b->info;
    }
    
  10. To Top

    Save

    /*!
    Writes out to the output stream an XML representation of this node and all of its children.
    This function format the XML using indentation.
    0) The original indentation is removed, and new indent may inserted based on setting.
    1) "0" means no line break no space;
    2) a none-zero integer means number of spaces put after a line break.
    3) Use indentation for given none-zero indent of all open tag
    4) Use indentation for given none-zero indent of all close tag except:
    a) Element with content presented as: <t>some text</t>
    b) Empty element presented as <t/>, not <t></t>
    */
    void TokenNode::save(StreamWriter *os, size_t indent, size_t blockShift)
    {
      m_missCloseAngle = false;
      int32 depth = 0;
      save(os, indent, this, depth, blockShift);
    }
    
  11. Walk

    Here list two ways of walking a tree.
    1. call next() to walk through a tree.
    Here is the method:
    TokenNode *TokenNode::next() 
    {
      if (!empty()) return firstChild();
      else {
        TokenNode *node = this;
        do {
          if (!node->parent()) return NULL; // the end
          TokenNode *sibling = node->nextSibling();
          if (sibling) return sibling;
          else node = node->parent();
        } while (1);
      }
    }
    
    Here is a test case:
      TokenNode *node = root; // point to the root
      while (node) {
        node = node->next();
      }
    
    Here is an example:
    
    bool tree_walk_test()
    {
      // Create a tree by reading an XML file
      fstring fileName("federal.xml");
      StreamReader reader;
      if (!reader.readFile(fileName)) return false;
    
      TokenNode *federal = reader.tree("federal");
    
      // tree walk to collect names
      fstring s;
      TokenNode *node = federal; // initialize
      while (node) {
        s += node->name() + ", ";
        node = node->next(); // step to next node
      }
    
      // result check
      if (s != "federal, state, town, Newton, Concord, city, Waltham, ") return false;
    
      // memory release
      if (federal) {
        delete federal;
        federal = NULL;
      }
      return true;
    }
    
    2. collect all nodes in a vector, then iterate through.
    Here is a test case:
      std::vector  all;
      root->nodes(all); // collect all nodes
      
      // iterate in STL manner
      std::vector <TokenNode *>::const_iterator i,
        b = all.begin(),
        e = all.end();
      for (i = b; i != e; ++ i) {
        list << (*i)->name();
      }
    
      // iterate in array manner
      for (size_t i = 0; i < all.size(); ++ i) {
        list << all[i]->name();
      }
    

XmlStreamReader | Tree