|
A Nested Set Implementation in Java and PostgreSQL |
This tutorial intends to give the reader:
See an alternate implementation that uses varchar fields.
The top of the tree is called the root. The nodes of the tree that have no subtrees beneath them are
called the leaf nodes. The descendants, or children, of a parent node are every node at all lower
levels in the subtree that has the parent node as its root. In the table the data looks like
this:
id parentId ______________ 1 null | 0 2 1 3 1 4 2 5 2 6 3 7 2 8 7This approach is a pretty standard list linking used in Online Transaction Processing type databases. It works pretty well, and doesn't require a lot of integrety checks. Make sure the parentId points to a valid id and you're set. Consider again the filesystem program. It would have the following use-cases.
SELECT * FROM table WHERE parentId = node;And the Linked-List approach works well for use-case -2-, as all you have to do is something like this, again an O(1) statement:
INSERT INTO table ( id, parentId ) VALUES ( newId, NODE );But things just totally fall apart in use case -3- there is no easy or efficient way to implement this using the Linked list. Most solutions for this problem drop into a stored procedure or use a middle tier using some iterative loop or a stack. The farther down the tree you have to go, the longer the statement runs. Most solutions to this problem are unelegent hacks that take too long and often leave the RDBMS integrity in a vulnrable state while its running. One elegent way to do this still using the linked list approach is to use in a recursive trigger like this to handle the parent/child stuff. Its still not really that fast, but it does keep all the logic in the RDBMS instead of in an outside method. For use case -4- you can use:
UPDATE table SET parentId = NODE;which is also very quick and easy. However, you're fucked again for use case -5-. This is a big problem, as most users expect to get search information back quickly. If you are running some sort of EIS system where people are constantly searching through deeply nested set of relational information you will soon be fired. This means we have to find an O(x) or quicker algorithm, or rethink the data schema. As very few DBA's have the luxury of rewriting their schemas on a whim, you have to think up a quick and easy way to improve the performace of the system without messing up your legacy stuff. This is why you want to use the Nested Set Model.
You see that the root, node number 1 is the largest set and contains every other node in the
system. So, how do we represent this in SQL? We do it using a number line where each set is
given a left and right hand bounds.
For example, below you can see the following:
SELECT
n0.nid
FROM
NestedSet n0,
NestedSet n1
WHERE
n1.nid = 0
AND n0.lft BETWEEN n1.lft AND n1.rft;
Table "nestedset" Column | Type | Notes ---------+--------+----------- nid | bigint | Primary Key parentId | bigint | Foreign Key (nestedSet.nid) rft | bigint | Unique Key lft | bigint | Unique Key
// First we need to assure the language is loaded.
CREATE FUNCTION plpgsql_call_handler () RETURNS OPAQUE AS '/usr/local/pgsql/lib/plpgsql' LANGUAGE C;
CREATE TRUSTED LANGUAGE plpgsql HANDLER plpgsql_call_handler;
// Now create the functions you'll need for operation.
// CloseGap is not necessary, but keeps the NestedSet structure clean.
CREATE OR REPLACE FUNCTION closeGap(BIGINT) RETURNS INTEGER AS '
DECLARE r RECORD;
BEGIN
SELECT INTO r * FROM nestedSet WHERE nid = $1;
UPDATE nestedSet
SET
lft = CASE
WHEN lft > r.lft
THEN lft - (r.rft - r.lft 1 )
ELSE lft END,
rft = CASE
WHEN rft > r.lft
THEN rft - (r.rft - r.lft 1 )
ELSE rft END;
RETURN 1;
END;
' LANGUAGE 'plpgsql';
CREATE OR REPLACE FUNCTION deleteSubtree(BIGINT) RETURNS INTEGER AS '
DECLARE r RECORD ;
BEGIN
SELECT INTO r * FROM nestedSet WHERE nid = $1;
DELETE FROM NestedSet
WHERE lft BETWEEN r.lft AND r.rft;
RETURN 1;
END;
' LANGUAGE 'plpgsql';
heres some reasonable unportable code as an example. Thank you joe celko.
/** * * Copyright 2001 by Shawn Deleurme * All rights reserved. * * $Id: NestedSet.java,v 1.28 2002/07/02 16:46:04 sdeleurme Exp $ */ package com.deleurme.facility.rdbms; import java.util.*; import java.sql.*; import java.util.logging.*; import java.util.logging.*; /** * This class encapsulates everything you need to handle a Nested Set * structure. *
* The connection object is utilized as a pass-through ref and should * not be messed with as general-use methods that use this object may * want to roll the entire transaction back. * * * @version 1.0 * */ public class NestedSet /*implements Storable*/ { public static final int ID = 1; public static final int PARENT_ID = 2; public static final int LEFT = 3; public static final int RIGHT = 4; /** Insert a new record into the NestedSet. */ static final String INSERT_NODE = "INSERT INTO NestedSet VALUES ( ?,?,0,0 ) \n" + // Set existing objects left hand bounds " UPDATE NestedSet " + " SET " + " lft = CASE " + " WHEN lft > (SELECT F1.lft FROM NestedSet as F1 WHERE F1.id = ?) " + " THEN lft + 2 " + // Set new objects right hand bounds " WHEN lft = 0 " + " THEN (SELECT F1.lft + 1 FROM NestedSet as F1 WHERE F1.id = ?) " + " ELSE lft " + " END, " + // Set existing objects right hand bounds " rft = CASE " + " WHEN rft > (SELECT F1.lft FROM NestedSet as F1 WHERE F1.id = ?) " + " THEN rft + 2 " + // Set new objects right hand bounds " WHEN lft = 0 " + " THEN (SELECT F2.lft + 2 FROM NestedSet as F2 WHERE F2.id = ?) " + " ELSE rft " + " END "; /** * Private connection. */ private Connection c ; /** * General use constructor inside of entities. */ public NestedSet(Connection c) { this.c = c ; } /** * Constructor for use with the coreConfig startup. */ public NestedSet() { } public void setParent( Datastore datastore ) {} /** * Logger */ Logger logger = Logger.getLogger("com.deleurme"); /** * Simple fast method to fetch a flattened tree. This method * removes the need for fancy stored procs and weird unportable * sql. * * This method simply returns Lists of Longs from the db. * For more general use, just use the NestedContraint. * * * * @return an ArrayList representation of the subtree. * * @param the root node of the tree * * @throws SQLException Thrown if something is wrong * with the connection. */ public List getSubtreeAsMaps( Long root ) throws SQLException { ResultSet results; // I'm big cause resizing me is slow. List list = new ArrayList(64); try { results = c.createStatement().executeQuery( " SELECT e.name, e.entityType " + " FROM NestedSet as n, entityView as e, NestedSet as n2 " + " WHERE " + " n2.nid =" + root.toString() + " AND e.id = n.nid " + " AND n.lft BETWEEN n2.lft AND n2.rft "); ArrayList row = new ArrayList(4); while ( results.next() ) { row.add(new Long(results.getLong(1))); row.add(new Long(results.getLong(2))); row.add(new Long(results.getLong(3))); row.add(new Long(results.getLong(4))); list.add(row.clone()); } } catch (NullPointerException npe) { logger.throwing("NestedSet","getSubtreeAsMaps", npe ); } catch (Exception e) { // throw (e) ; logger.throwing("NestedSet","getSubtreeAsMaps", e ); } finally { return list; } } /** * Retrieves a list of nodes beneath a tree. I strongly recommend. * method should only be invoked in some sort of serialized transactional state. * * * @param root the target node to delete from. * @return A list of maps corresponding to the columns returned. * @throws SQLException Thrown if something is wrong with the connection. */ public List getSubtree( Long root ) throws SQLException { ResultSet results; // I'm big cause resizing me is slow. List list = new ArrayList(64); try { results = c.createStatement().executeQuery( " SELECT * " + " FROM nestedSet as n, nestedSet as n2 " + " WHERE " + " n2.nid =" + root.toString() + " AND n.lft BETWEEN n2.lft AND n2.rft "); // ArrayList row = new ArrayList(4); HashMap row = new HashMap (4); while ( results.next() ) { row.clear(); // don't forget to clear results Collections across loop boundaries row.put( "nid", new Long(results.getLong("nid"))); row.put( "parentId", new Long(results.getLong("parentId"))); row.put( "lft", new Long(results.getLong("lft"))); row.put( "rft", new Long(results.getLong("rft"))); list.add(row.clone()); } } catch (NullPointerException npe) { logger.throwing("NestedSet","getSubtree", npe ); } catch (Exception e) { // throw (e) ; logger.throwing("NestedSet","getSubtree", e ); } finally { return list; } } /** * This method removes all the records beneath a node. * * @param root the target node to delete from. * @throws SQLException Thrown if something is wrong with the connection. */ public void deleteSubtree( Long root ) throws SQLException { int rows = c.createStatement().executeUpdate( " DECLARE " + " @dropLft BIGINT, " + " @dropRft BIGINT " + " SELECT " + " @dropLft=lft, " + " @dropRft=rft " + " FROM NestedSet " + " WHERE nid = " + root.toString() + " DELETE FROM NestedSet " + " WHERE lft BETWEEN @dropLft AND @dropRft; " ); /////////////////////////////////////////////// // Close Gaps! /////////////////////////////// /////////////////////////////////////////////// closeGap( root ); } /** * This method moves a node. This is the hardest * thing to do in the nested set model. * * @param the current node. * @param the destination node. * @throws SQLException Thrown if something is wrong with the connection. */ public void moveNode( Long root, Long destination) throws SQLException { //try { // Get the Subtree and save it. List subTree = getSubtree( root ); // Remove the subtree from its original spot. deleteSubtree( root ); // Insert the subtree at the destination node. insertSubtree( destination, subTree ); // } // catch (SQLException se) { // try { // c.rollback() ; // } // catch (SQLException sse) { // throw new RuntimeException ("Nested Set Severe Failure. Could not roll back.", sse) ; // } // logger.info( "\nmoveNode failed - but was rolled back." ); // throw (se); // } } /** * Inserts an entire subtree at a given node. * * * @param the current connection. * @param List a list of maps representing rows. * @throws SQLException Thrown if something is wrong with the connection. */ public void insertSubtree( Long destination, List subTree ) throws SQLException { if (subTree.size() == 0) return; StringBuffer buffer = new StringBuffer(256); // // Update the root object parentId to // the destination objects id. // //((ArrayList) subTree.get(0)).set(1, destination ); Map rootRow = (Map) subTree.get(0); rootRow.put("parentId",destination); for (Iterator i=subTree.iterator();i.hasNext(); ){ Map row = (Map) i.next(); Long id = (Long) row.get("nid"); Long parentId = (Long) row.get("parentId"); buffer.append( "INSERT INTO NestedSet (nid,\"parentId\", lft, rft) " + "VALUES ( " + id + "," + parentId + ",0,0 )\n" ); } buffer.append ( // Set existing objects left hand bounds "UPDATE NestedSet " + "SET " + "lft = CASE " + " WHEN lft > (SELECT F1.lft FROM NestedSet as F1 WHERE F1.nid = " + destination + " ) THEN lft + 2 " + // Set new objects right hand bounds " WHEN lft = 0 " + " THEN (SELECT F1.lft + 1 FROM NestedSet as F1 WHERE F1.nid = " + destination + " ) ELSE lft " + " END, " + // Set existing objects right hand bounds " rft = CASE " + " WHEN rft > (SELECT F1.lft FROM NestedSet as F1 WHERE F1.nid = " + destination + " ) THEN rft + 2 " + // Set new objects right hand bounds " WHEN lft = 0 " + " THEN (SELECT F2.lft + 2 FROM NestedSet as F2 WHERE F2.nid = " + destination + " ) ELSE rft " + " END "); // logger.info( buffer.toString() ); // for debugging c.createStatement().execute(buffer.toString()); } /** * Inserts and then synchronizes the NestedSets * across the table. * * @param a map of relevent information. * @throws SQLException Thrown if something is wrong with the connection. */ public void insertNode( Map map ) throws SQLException { PreparedStatement statement; long parentId = ((Long) map.get("parentId")).longValue(); long id = ((Long) map.get("id")).longValue(); statement = c.prepareStatement( INSERT_NODE ); //////////////////////////////////////////////// // Sets up parentId and the objects id // //////////////////////////////////////////////// statement.setLong(1, id ); statement.setLong(2, parentId ); statement.setLong(3, parentId ); statement.setLong(4, parentId ); statement.setLong(5, parentId ); statement.setLong(6, parentId ); statement.execute(); } /** * A debugging method. * */ public void printTable() throws SQLException { ResultSet r = null ; r = c.createStatement().executeQuery("SELECT * FROM nestedSet") ; StringBuffer s = new StringBuffer(512); s.append( "-------------------------------------\n" ); s.append( "nid\tpid\tleft\tright\n" ); s.append( "-------------------------------------\n" ); while (r.next()){ s.append ( r.getString( "nid" ) ).append ( "\t" ); s.append ( r.getString( "parentId" ) ).append ( "\t" ); s.append ( r.getString( "lft" ) ).append ( "\t" ); s.append ( r.getString( "rft" ) ).append ( "\t\n" ); } logger.info( "\n\n" + s.toString() ) ; } /////////////////////////////////////////////////////////////////////////////////// // // Handy little methods. // /////////////////////////////////////////////////////////////////////////////////// /** * Useful to maintain the index. */ public void closeAllGaps(){ // someday someone will write me.... } /** * Closes a gap between the points of a root object. While not * doing this does not break things, the nestedset indexing will become harder to * understand. * * @return the number of rows affected. * @param the root of the gap. * @throws SQLException on horrible failure. * */ public int closeGap(Long root) throws SQLException { return c.createStatement().executeUpdate ( // Find the gap. " DECLARE @LEFT BIGINT, @RIGHT BIGINT " + " SELECT @LEFT = lft, @RIGHT = rft " + " FROM NestedSet " + " WHERE nid = " + root.toString() // Close the gap. + " UPDATE NestedSet " + " SET " + " lft = CASE " + " WHEN lft > @LEFT " + " THEN lft - ( @RIGHT - @LEFT + 1 ) " + " ELSE lft END, " + " rft = CASE " + " WHEN rft > @LEFT " + " THEN rft - ( @RIGHT - @LEFT + 1 ) " + " ELSE rft END; " ); } /** * @params Long node. */ public void deleteNode(Long node){ throw new RuntimeException ("NestedSet.deleteNode(Long node) not implemented yet"); } /** * Returns map * * * */ public List getPathBetweenNodes(Long start, Long end) throws SQLException { ResultSet r = c.createStatement().executeQuery( " SELECT t2.nid,( " + " SELECT COUNT(*) " + " FROM NestedSet as T4 " + " WHERE t4.lft BETWEEN t1.lft AND t1.rft " + " AND t2.lft BETWEEN t4.lft AND t4.rft ) AS path " + " FROM nestedSet as T1, nestedSet as T2, nestedSet as T3 " + " WHERE t1.nid = " + start + " AND t3.nid = " + end + " AND t2.lft BETWEEN t1.lft AND t1.rft " + " AND t3.lft BETWEEN t2.lft AND t2.rft ") ; // you know, i used to work. i can't remember why this is broten. return null; } // // // ------------------------------------ // // /** * Useful to maintain the index. */ public void makeSpace(){ } public int getHeight(){ return 0; } public List getChildrenAtLevel(int level){ return null; } public List getImmediateChildren(){ return null; } public List getLeaves() { return null; } public List getPath() { return null; } }