relation - Operators for Relation Values
This manpage describes the relation command. The relation command is part of the Tcl Relational Algebra Library (ral) package. The relation command defines a set of operators on relations that constitute the formal algebra of relations that is supported by TclRAL. The algebra is patterned after that described by Date (Date, C.J., An Introduction to Database Systems, 8th ed, Pearson Education, 2004, ISBN 0-321-19784-4, chap. 7)
Formally a relation has a heading and a body. The heading, like a tuple heading, defines the attribute names and data types of the components of the relation. The body consists of tuples, all of which match the heading. Because relations are fundamentally sets, several things should be noted:
There is no left to right ordering of the attributes. Like tuples, relations have no real left to right ordering. The implementation may choose to store attributes in any order and users should not depend upon certain string representations nor operate on relations with any operators other than those given here.
There is no top to bottom ordering of tuples. Again the implementation is free to store tuples in any order it chooses. There are no access operations that depend upon a tuple at some fixed offset or location.
There are no duplicate tuples in the body of a relation value and there is no concept of a NULL value. Relations are sets and sets do not have duplicates. All attributes of all tuples of a relation must contain a valid value of the data type defined for that attribute.
Note that all the operators are read only in the sense that they operate on relation values without modifying them. All of these operators return either a new relation or scalar value and do not modify the relation arguments. Only the relvar command can modify, in place, the relation value that is held in a relation variable.
Like any Tcl value, relations have a string representation. The string representation of a relations is a specially formatted list consisting of two elements:
The relation heading.
The relation body.
The relation heading is a list containing an even number of elements consisting of alternating attribute names and attribute data types. It is the same form as a tuple heading.
The body of a relation consists of a list of tuples. Each tuple in the body is in a list containing an even number of elements consisting alternately of an attribute name and its corresponding attribute value.
The following is a literal string representation of a relation with one tuple in its body.
{DogName string Breed string} {{DogName Fido Breed Poodle}}
Several set-like operation are supported by TclRAL (e.g. union and intersection). These operations work in much the same way as the familiar mathematical operations on sets. When applied to relation values, there is a special consideration. In order to perform set-like operations on relations they must have the same heading. This means that the headings must have the same attribute names and the corresponding attributes must have the same data type.
In the command descriptions that follow, we will often refer to the example data given below. We assume that the relation is contained in a Tcl variable with the same name as the tables given below. For brevity we assume that the necessary commands have been imported into the namespace of the example.
% set DOG { {DogName string Breed string} { {DogName Fido Breed Poodle} {DogName Sam Breed Collie} {DogName Spot Breed Terrier} {DogName Rover Breed Retriever} {DogName Fred Breed Spaniel} {DogName Jumper Breed Mutt} } } % relformat $DOG DOG +-------+---------+ |DogName|Breed | |string |string | +-------+---------+ |Fido |Poodle | |Sam |Collie | |Spot |Terrier | |Rover |Retriever| |Fred |Spaniel | |Jumper |Mutt | +-------+---------+ DOG --- % set OWNER { {OwnerName string Age int City string} { {OwnerName Sue Age 24 City Cupertino} {OwnerName George Age 35 City Sunnyvale} {OwnerName Alice Age 30 City {San Jose}} {OwnerName Mike Age 50 City {San Jose}} {OwnerName Jim Age 42 City {San Francisco}} } } % relformat $OWNER OWNER +---------+---+-------------+ |OwnerName|Age|City | |string |int|string | +---------+---+-------------+ |Sue |24 |Cupertino | |George |35 |Sunnyvale | |Alice |30 |San Jose | |Mike |50 |San Jose | |Jim |42 |San Francisco| +---------+---+-------------+ OWNER ----- % set OWNERSHIP { {OwnerName string DogName string Acquired string} { {OwnerName Sue DogName Fido Acquired 2001} {OwnerName Sue DogName Sam Acquired 2000} {OwnerName George DogName Fido Acquired 2001} {OwnerName George DogName Sam Acquired 2000} {OwnerName Alice DogName Spot Acquired 2001} {OwnerName Mike DogName Rover Acquired 2002} {OwnerName Jim DogName Fred Acquired 2003} } } % relformat $OWNERSHIP OWNERSHIP +---------+-------+--------+ |OwnerName|DogName|Acquired| |string |string |string | +---------+-------+--------+ |Sue |Fido |2001 | |Sue |Sam |2000 | |George |Fido |2001 | |George |Sam |2000 | |Alice |Spot |2001 | |Mike |Rover |2002 | |Jim |Fred |2003 | +---------+-------+--------+ OWNERSHIP ---------
The array subcommand converts the relation value given by relationValue into an array variable named arrayVarName. The values of keyAttr are used as the array indices and the corresponding values of valueAttr are the array value. Note, that if the values of the keyAttr attribute are not unique then it is possible that more than one tuple in relationValue will have the same value for keyAttr. In this case the array value from valueAttr corresponding to duplicated values of keyAttr is arbitrarily chosen by the implementation. Thus the safe approach is to project the relation value that is to be converted into an array to be a binary relation or to tag the relation and use the tag attribute name as keyAttr thereby insure that no duplicated array indices can occur.
% relation array $DOG dogarray DogName Breed % parray dogarray dogarray(Fido) = Poodle dogarray(Fred) = Spaniel dogarray(Jumper) = Mutt dogarray(Rover) = Retriever dogarray(Sam) = Collie dogarray(Spot) = Terrier
The assign subcommand is short hand for tuple assign [relation tuple relationValue] ?...?. This allow moving the values of a relation with a single tuple into ordinary Tcl variable.
% relation foreach d $DOG -descending DogName { relation assign $d DogName puts $DogName } Spot Sam Rover Jumper Fred Fido
The attributes subcommand returns the names of the attributes of relationValue as a list.
% relation attributes $DOG DogName Breed % relation attributes $OWNER OwnerName Age City
The body subcommand returns the body of the relation value given by relationValue. The body is a list of tuple values, each tuple value being in turn a list of attribute name / attribute value pairs. The order of the tuple values in the body list is arbitrary.
% foreach t [relation body $DOG] {puts $t} DogName Fido Breed Poodle DogName Sam Breed Collie DogName Spot Breed Terrier DogName Rover Breed Retriever DogName Fred Breed Spaniel DogName Jumper Breed Mutt
The cardinality subcommand returns the number tuples contained in the body of relationValue.
% relation cardinality $DOG 6 % relation cardinality $OWNERSHIP 7
The compose subcommand computes the join of relationValue1 and relationValue2 but eliminates all of the attributes used in the join. The returned relation has a heading the same as the union of the headings of relationValue1 and relationValue2 minus all of the join attributes from the two relations. The returned relation has a body that is the same as that of the join minus any duplicated tuples that result when the join attributes are eliminated. As with join, if the -using argument are missing, the join is computed across the attributes in relationValue1 and relationValue2 that are named the same. Otherwise the attrList argument is treated the same as for the join subcommand.
The create subcommand returns the relation value described by the header and tupleN arguments. The header argument is a list of attribute name / attribute type pairs. Any tuples given by the ?tupleN? arguments are inserted into the relation value. Each tupleN argument is a list of attribute name / attribute value pairs. This command allows for creating a relation value without having knowledge of the string representation of a relation value. Compare with the ::ral::tuple create command.
% relation create {DogName string Breed string} {DogName Fido Breed Poodle} {DogName string Breed string} {{DogName Fido Breed Poodle}}
The degree subcommand returns the number of attributes in the heading of relationValue
% relation degree $DOG 2
The dict subcommand returns a dictionary object whose keys are the values of the keyAttr and whose values are the values of the valueAttr. Like the relation array command, the values of the keyAttr attribute should be unique to be assured that the correct corresponding values of the valueAttr attribute are in the resulting dictionary.
% dict for {name breed} [relation dict $::DOG DogName Breed] { puts "$name ==> $breed" } Fred ==> Spaniel Sam ==> Collie Rover ==> Retriever Spot ==> Terrier Fido ==> Poodle Jumper ==> Mutt
The divide subcommand implements the relational divide operation. The headings of dividend and divisor must be disjoint and the heading of mediator must be the union of the dividend and divisor headings. The returned result is a new relation that has the same heading as dividend and contains all the tuples from dividend whose corresponding tuples in mediator include all the tuples in divisor. Stated another way, the result of divide subcommand is the maximal set of tuples from dividend whose Cartesian product with divisor is completely contained in mediator. Using the data from our ongoing example, then the following example shows how division can be used to find all the Dogs owned by both Sue and George.
% set dividend [relation project $::DOG DogName] {DogName string} {\ {DogName Fido}\ {DogName Sam}\ {DogName Spot}\ {DogName Rover}\ {DogName Fred}\ {DogName Jumper}} % puts [relformat $dividend Dividend:] +-------+ |DogName| |string | +-------+ |Fido | |Sam | |Spot | |Rover | |Fred | |Jumper | +-------+ Dividend: --------- % set divisor [relation project [relation restrict $::OWNER t\ {[tuple extract $t OwnerName] eq "Sue" ||\ [tuple extract $t OwnerName] eq "George"}] OwnerName] {OwnerName string} {{OwnerName Sue} {OwnerName George}} % puts [relformat $divisor Divisor:] +---------+ |OwnerName| |string | +---------+ |Sue | |George | +---------+ Divisor: -------- % set mediator [relation eliminate $::OWNERSHIP Acquired] {OwnerName string DogName string} {\ {OwnerName Sue DogName Fido}\ {OwnerName Sue DogName Sam}\ {OwnerName George DogName Fido}\ {OwnerName George DogName Sam}\ {OwnerName Alice DogName Spot}\ {OwnerName Mike DogName Rover}\ {OwnerName Jim DogName Fred}} % puts [relformat $mediator Mediator:] +---------+-------+ |OwnerName|DogName| |string |string | +---------+-------+ |Sue |Fido | |Sue |Sam | |George |Fido | |George |Sam | |Alice |Spot | |Mike |Rover | |Jim |Fred | +---------+-------+ Mediator: --------- % set quotient [relation divide $dividend $divisor $mediator] {DogName string} {{DogName Fido} {DogName Sam}} % puts [relformat $quotient "All dogs owned by both Sue and George"] +-------+ |DogName| |string | +-------+ |Fido | |Sam | +-------+ All dogs owned by both Sue and George -------------------------------------
The dunion subcommand returns the set union of two or more relations. However, unlike the relation union command, the dunion command insists that all the relation values be disjoint. All relations must be of the same type. The result relation has a heading that is the same as any of the arguments and has a body consisting of all tuples present in any of the relationValue arguments. However, if any duplicated tuples are found, then an error results. Since the dunion operation is both associative and commutative, the order of the relationValue arguments has no effect the result.
The eliminate subcommand returns a new relation that has a heading equal to that of the relationValue minus any attributes whose names are given in the attribute arguments. The body of the new relation is the same as the body of relationValue removing any tuples that might be duplicated as a result of removing the attributes. The eliminate subcommand complements the project command in the sense that eliminate specifies which attributes to discard and project specifies which attribute to keep.
% relformat [relation eliminate $::DOG Breed] +-------+ |DogName| |string | +-------+ |Fido | |Sam | |Spot | |Rover | |Fred | |Jumper | +-------+
The emptyof subcommand returns a new relation that has the same heading as relationValue but whose cardinality is zero.
The extend subcommand returns a new relation which has the same heading as relationValue with zero or more additional attributes. The first additional attribute is given by attr1 which has type type1 and its value is set to the result returned by passing expr1 to the expr command. Subsequent attributes are treated similarly. As each tuple in the body of relationValue is considered, its value is set into the variable whose name is given by the tupleVariable argument. This variable is accessible to the extending expressions so that the current tuple values of relationValue are available for computing the values of the new attributes.
% relformat [relation extend $::OWNER o AgeInMonths int {[tuple extract $o Age] * 12}] +---------+---+-------------+-----------+ |OwnerName|Age|City |AgeInMonths| |string |int|string |int | +---------+---+-------------+-----------+ |Sue |24 |Cupertino |288 | |George |35 |Sunnyvale |420 | |Alice |30 |San Jose |360 | |Mike |50 |San Jose |600 | |Jim |42 |San Francisco|504 | +---------+---+-------------+-----------+
The extract subcommand is short hand for tuple extract [relation tuple relationValue] attrName ?...?. This command obtains one or more attribute values from a relation that is of cardinality one.
The foreach subcommand provides a means to iterate through the body of a relation. For each tuple in the body of relationValue, the relationVariable variable is successively set to a relation value that contains a single tuple from relationValue and then script is executed. The order in which the tuples are considered is unspecified unless a list of sorting attributes is specified by the attr-list argument. In this case, the tuples are visited in the -ascending order of the values of the sorting attibutes if the direction option of -ascending is supplied or if no direction option is given. Tuples can be visited in descending order of the sorting attributes if the -descending option is given.
% relation foreach d $DOG -descending DogName { puts [tuple extract [relation tuple $d] DogName] } Spot Sam Rover Jumper Fred Fido
The fromdict subcommand returns a relation value of degree two that contains all the keys and values from dictValue. The heading of the returned relation will have two attributes names keyattr and valattr of the corresponding types, keytype and valtype. All the key / value pairs from the dictionary are inserted as tuples in the returned relation value. This command provides a simpler means to move data from core Tcl types into relation values. Compare with relation dict which extracts relation attribute values into Tcl dictionaries.
% set d [dict create Fido Poodle Sam Collie Spot Terrier] % relformat [relation fromdict $d DogName string Breed string] +-------+-------+ |DogName|Breed | |string |string | +-------+-------+ |Fido |Poodle | |Sam |Collie | |Spot |Terrier| +-------+-------+
The fromlist subcommand returns a relation value of degree one that contains the values in the list given by listValue. The heading of the returned relation contains a single attribute named, attrname, whose type is given by, attrtype. Since relations are sets and sets do not have duplicated values, any duplicated values in listValue are silently excluded from the returned relation value. Comparing the cardinality of the returned relation with the length of listValue can be used to determine of any elements of listValue were excluded. This command provides a simpler means to convert lists into a relation value. Compare with relation list which extracts relation attribute values into Tcl lists.
The group subcommand creates a new relation from relationValue that contains an attribute which is of type Relation. The newattribute argument gives the name of the relation valued attribute and the attrN arguments give the names of attributes in relationValue that are to be assembled into a relation value. The returned relation has a heading that is the same as relationValue minus the attributes given in the attrN arguments plus the new relation attribute given by newattribute. The resulting body has tuples for each unique set of values for the remaining ungrouped attributes in the original relation with the corresponding part of the tuple placed in the new relation valued attribute. See also the ungroup subcommand for the complementary operation.
% relformat [relation group $::OWNERSHIP DogAcquisition DogName Acquired] +---------+------------------+ |OwnerName|DogAcquisition | |string |Relation | +---------+------------------+ |Sue |+-------+--------+| | ||DogName|Acquired|| | ||string |string || | |+-------+--------+| | ||Fido |2001 || | ||Sam |2000 || | |+-------+--------+| |George |+-------+--------+| | ||DogName|Acquired|| | ||string |string || | |+-------+--------+| | ||Fido |2001 || | ||Sam |2000 || | |+-------+--------+| |Alice |+-------+--------+| | ||DogName|Acquired|| | ||string |string || | |+-------+--------+| | ||Spot |2001 || | |+-------+--------+| |Mike |+-------+--------+| | ||DogName|Acquired|| | ||string |string || | |+-------+--------+| | ||Rover |2002 || | |+-------+--------+| |Jim |+-------+--------+| | ||DogName|Acquired|| | ||string |string || | |+-------+--------+| | ||Fred |2003 || | |+-------+--------+| +---------+------------------+
The heading subcommand returns the relation heading of the relationValue. The heading is a list consisting attributes and corresponding data types.
% relation heading $DOG DogName string Breed string
The insert command returns a new relation value whose heading is the same as relationValue and whose body includes all the tuples of relationValue plus all the tuples given as the name-value-listN arguments. If inserting any of the name-value-listN arguments would result in a duplicate tuple, then an error is thrown.
The intersect subcommand returns the set intersection of two or more relations. All relations have equal headings. The result relation has a heading that is the same as any of the arguments and has a body consisting of all tuples present in all of the relationValue arguments. Since the intersection operation is both associative and commutative, the order of the relationValue arguments has no effect the result.
The is subcommand returns a boolean value based on performing a comparison operation between two relation values. Allowed values of compareop are:
Returns 1 if relationValue1 is equal to relationValue2.
Returns 1 if relationValue1 is not equal to relationValue2.
Returns 1 if relationValue1 is a proper subset of relationValue2.
Returns 1 if relationValue1 is a subset of relationValue2.
Returns 1 if relationValue1 is a proper superset of relationValue2.
Returns 1 if relationValue1 is a superset of relationValue2.
Both relationValue1 and relationValue2 must be of the same type to be compared.
% set sDogs [relation restrictwith $DOG {[string match S* $DogName]}] {DogName string Breed string} {{DogName Sam Breed Collie} {DogName Spot Breed Terrier}} % relation is $DOG == $sDogs 0 % relation is $DOG != $sDogs 1 % relation is $DOG supersetof $sDogs 1 % relation is $DOG > $sDogs 1 % relation is $DOG propersubsetof $sDogs 0 % relation is $DOG <= $sDogs 0
The isempty subcommand returns the boolean value "1" if the body of relationValue does not contain any tuples and "0" otherwise. The isempty subcommand is a convenient short hand for the the command expr {[::ral::relation cardinality relationValue] == 0}.
The isnotempty subcommand returns the boolean value "1" if the body of relationValue contains any tuples and "0" otherwise. The isnotempty subcommand is a convenient short hand for the the command expr {[::ral::relation cardinality relationValue] != 0}.
The issametype subcommand returns the boolean value "1" if the headings of relationValue1 is the same as that of relationValue2 and "0" otherwise. Since the relation headings do not have any intrinsic ordering, using string comparison of relation headings to determine equality is not reliable and therefore the issametype command should be used to determine type equality.
The join subcommand performs the natural join of relationValue1 with relationValue2. The join is performed across the given attrList. If no attrList are given, then the join is performed across attributes with the same name in both relations (because of the rename subcommand it is always possible to arrange for the relations to have common names for the join attributes). If present, the attrList is a list of pairs of attribute names. The even numbered elements give the name of a join attribute in relationValue1 and the odd numbered elements give the name of the corresponding join attribute in relationValue2. The heading of the resulting relation consists of all the attibutes of relationValue1 plus all the attributes of relationValue2 minus any of the join attributes from relationValue2. The body of the resulting relation consists all tuples composed from the tuples in both relationValue1 and relationValue2 where the values of the join attributes in relationValue1 equal those of relationValue2. Several relations can be joined by optionally listing them as additional relationValueN arguments. Additional ?-using? arguments may be given for each pair of relationValue arguments to specify the attributes across which the join is to be performed. The ?-using? options are a convenient short hand for having to rename attributes to achieve common attribute names. The join operation is both associative and commutative and so the order of the relationValueN does not affect the result.
% relformat [relation join $OWNERSHIP $DOG] +---------+-------+--------+---------+ |OwnerName|DogName|Acquired|Breed | |string |string |string |string | +---------+-------+--------+---------+ |Sue |Fido |2001 |Poodle | |Sue |Sam |2000 |Collie | |George |Fido |2001 |Poodle | |George |Sam |2000 |Collie | |Alice |Spot |2001 |Terrier | |Mike |Rover |2002 |Retriever| |Jim |Fred |2003 |Spaniel | +---------+-------+--------+---------+
The list subcommand returns the values of a single attribute from all the tuples of relationValue as a proper Tcl list. If attrName is not given, then RelationValue must be of degree one or an error is returned. Otherwise the attribute values of attrName are returned. Since lists are ordered and tuples in a relation values are not, an optional sortAttrList may be specified. In this case, the returned list is in the same order as the corresponding values of the attributes given in sortAttrList. By default the sorting is -ascending unless otherwise specified. If no sorting arguments are given, the returned order is arbitrary and unspecified. Note that the list returned from a relation of degree one is necessarily a set and may be used with the struct::set package. In this case if the values of the attrName attribute are not unique then, in general, the returned list is not a set and may include duplicated values.
% relation list [relation project $OWNERSHIP Acquired]] 2001 2000 2002 2003 % relation list $OWNERSHIP Acquired -ascending {OwnerName DogName} 2001 2001 2000 2003 2002 2001 2000
The minus subcommand returns the set difference between two relations. The relationValue arguments must be of the same type and that is the type of the result relation. The body of the result consists of those tuples present in relationValue1 but not present in relationValue2. Note that the order of the relationValue1 and relationValue2 arguments is significant to the result, i.e. minus is not a commutative operation.
The project subcommand returns a relation whose heading consists of only those attributes listed in the attrN arguments. The body of the result consists of tuples the corresponding tuples from relationValue, removing any duplicates created by considering only a subset of the attributes.
% relformat [relation project $OWNER City] +-------------+ |City | |string | +-------------+ |Cupertino | |Sunnyvale | |San Jose | |San Francisco| +-------------+
The rank subcommand returns a new relation whose heading is the same as relationValue extending by an attribute named newAttr. The type of newAttr will be int and its value will be set to the number of tuples in relationValue where the value of rankAttr is less than or equal to (?-descending?) or greater than or equal to (?-ascending?) that of the given tuple. The default ranking is -ascending. The type of rankAttr must be int, double, or string. The rank command is useful when it is desirable to limit the number of tuples in the result. For example, to find the names and ages of the two oldest owners:
% set ro [relation rank $OWNER -descending Age AgeRank] {OwnerName string Age int City string AgeRank int} {\ {OwnerName Sue Age 24 City Cupertino AgeRank 5}\ {OwnerName George Age 35 City Sunnyvale AgeRank 3}\ {OwnerName Alice Age 30 City {San Jose} AgeRank 4}\ {OwnerName Mike Age 50 City {San Jose} AgeRank 1}\ {OwnerName Jim Age 42 City {San Francisco} AgeRank 2}} % relformat [relation project [relation restrictwith $ro {$AgeRank <= 2}] OwnerName Age] +---------+---+ |OwnerName|Age| |string |int| +---------+---+ |Mike |50 | |Jim |42 | +---------+---+
The rename subcommand returns a relation whose heading has each oldname attribute changed to newname and whose body is the same as relationValue. An arbitrary number of oldname / newname pairs may be given. Renaming is processed from left to right in the command arguments and it is not an error to change the name of any given attribute multiple times. However, oldname must always be the name of an attribute at the time the rename takes place. The rename subcommand is useful for manipulating the attribute names of a relation to prevent naming conflicts in certain operations (e.g. times and join).
% relformat [relation rename $OWNER City Town] +---------+---+-------------+ |OwnerName|Age|Town | |string |int|string | +---------+---+-------------+ |Sue |24 |Cupertino | |George |35 |Sunnyvale | |Alice |30 |San Jose | |Mike |50 |San Jose | |Jim |42 |San Francisco| +---------+---+-------------+ % relformat [relation rename $OWNER City Town City Village] unknown attribute name, "City"
The restrict subcommand returns a new relation that is a subset (possibly improper) of relationValue. Each tuple in the body of relationValue is successively assigned to tupleVariable and expression is evaluated. The resulting relation has a heading that is the same as relationValue and a body consisting of all tuples where expression evaluated to true.
% relformat [relation restrict $DOG d {[string match S* [tuple extract $d DogName]]}] +-------+-------+ |DogName|Breed | |string |string | +-------+-------+ |Sam |Collie | |Spot |Terrier| +-------+-------+
The restrictwith subcommand behaves like the restrict subcommand except that the values of the tuple attributes are placed in Tcl variables that have the same name as the attributes. Each tuple in the body of relationValue is successively considered and each attribute of the tuple is assigned to a Tcl variable that is the same name as the attribute and expression is evaluated. The resulting relation has a heading that is the same as relationValue and a body consisting of all tuples where expression evaluated to true. The restrictwith subcommand is roughly equal to performing a tuple assign command on each tuple in relationValue before evaluating expression. The attribute variables are created if needed and will overwrite the values of any variables that are named the same. Also, the variables are deleted after the command completes. Also note that restrictwith will not function correctly if nested with a more complex expression that also contains a restrictwith command for a relationValue with the same heading. In this case the invocation of one restrictwith may very well overwrite the variables of a separate invocation. With that restriction in mind, the restrictwith subcommand is often more convenient than performing all the required tuple extract commands that are otherwise required.
% relformat [relation restrictwith $DOG {[string match S* $DogName]}] +-------+-------+ |DogName|Breed | |string |string | +-------+-------+ |Sam |Collie | |Spot |Terrier| +-------+-------+ % puts $DogName can't read "DogName": no such variable
The semijoin subcommand computes the join of relationValue1 and relationValue2 but eliminates all of the attributes of relationValue1 (or alternatively speaking, projecting all attributes of relationValue2). The returned relation has a heading the same as relationValue2 and a body consisting of those tuples in relationValue2 that would have been included in the natural join with relationValue1. As with join, if the -using argument are missing, the join is computed across the attributes in relationValue1 and relationValue2 that are named the same. Otherwise the attrList argument is treated the same as for the join subcommand. Also like the join subcommand, additional relationValue arguments may be given and the result is computed in left to right order. This implies that the type of the result is always the type of the right most relationValue. N.B. the sense of this command is inverted from previous versions of this library. For example, to find all the dogs that have some owner:
% relformat [relation semijoin $OWNERSHIP $DOG] +-------+---------+ |DogName|Breed | |string |string | +-------+---------+ |Fido |Poodle | |Sam |Collie | |Spot |Terrier | |Rover |Retriever| |Fred |Spaniel | +-------+---------+
The semiminus subcommand computes the difference between relationValue2 and the semijoin of relationValue1 and relationValue2. The returned relation has a heading equal to that of relationValue2 and a body consisting of those tuples from relationValue2 which would not have been included in the natural join of relationValue1 and relationValue2. The optional -using argument is treated in the same way as for the join subcommand. Also like the semijoin subcommand, additional relationValue arguments may be given. N.B. the sense of this command is inverted from previous versions of this library. For example, to find all dogs that have no owner:
% relformat [relation semiminus $OWNERSHIP $DOG] +-------+------+ |DogName|Breed | |string |string| +-------+------+ |Jumper |Mutt | +-------+------+
The summarize subcommand allows for computations across sets of tuples. The perRelation must be of the same type as a projection of relationValue. The returned relation has the same heading as perRelation extended by attr. The attr gives the name of the new attribute. The name of the new attribute may not match any existing attributes in perRelation. The type argument is data type of the new attribute. Several new attributes may be added in one summarize command. The body of the returned relation consists of all the tuples of perRelation with values for the new attributes. Those values are computed by evaluating the summmary arguments as an expression. Before evaluating expression for a tuple in perRelation, relationVarName is set to a relation value consisting of all the tuples in relationValue whose attributes match the corresponding attributes in perRelation. The expression expression may then access the value of that relation using the given relationVarName variable name. The expression must return a result that matches the type of the new attribute. In this example we determine the years when dogs were acquired and the number of dogs acquired in each year.
% set ac [relation project $OWNERSHIP Acquired] {Acquired string} {{Acquired 2001} {Acquired 2000} {Acquired 2002} {Acquired 2003}} % relformat [relation summarize $OWNERSHIP $ac s NumAcquired int {[relation cardinality $s]}] +--------+-----------+ |Acquired|NumAcquired| |string |int | +--------+-----------+ |2001 |3 | |2000 |2 | |2002 |1 | |2003 |1 | +--------+-----------+
The summarize command can also be used to perform more complicated computations. In the case of summarizing over the dee relation gives the ability to compute overall totals or averages. In this example we compute the average age of the owners.
proc ravg {rel attr} { set sum 0 relation foreach r $rel { tuple assign [relation tuple $r] incr sum $Age } return [expr {$sum / [relation cardinality $rel]}] } % set dee {{} {{}}} {} {{}} % relformat [relation summarize $OWNER $dee o AverageAge int {[ravg $o Age]}] +----------+ |AverageAge| |int | +----------+ |36 | +----------+
The summarizeby subcommand is a more convenient form of summarize where the per relation is a projection of the relation value that is to be summarized. Rather than supplying a per relation, instead a list of attributes is given by the attrList argument and relationValue is projected on those attributes and used as the per relation. The arguments and results are otherwise the same as for the summarize command.
% relformat [relation summarizeby $OWNERSHIP Acquired s\ NumAcquired int {[relation cardinality $s]}] +--------+-----------+ |Acquired|NumAcquired| |string |int | +--------+-----------+ |2001 |3 | |2000 |2 | |2002 |1 | |2003 |1 | +--------+-----------+
The table subcommand creates a new relation whose heading is given by heading and tuples of the body are given by the value-listN arguments. The heading is an list containing an even number of elements consisting of alternating attribute names and attribute data types. The value-listN arguments are a list of attribute values. It is assumed that the attribute values are given in the same order as the attributes of the heading argument. This command is a convenience command that is much like relation create but avoids having to give the attribute name for each attribute value when specifying the tuples of the body. This allows literal relation values to be specified in a tabular layout.
set o [relation table {\ OwnerName string Age int City string } { Sue 24 Cupertino } { George 35 Sunnyvale } { Alice 30 {San Jose} }]
The tag subcommand creates a new relation which has the same heading as relationValue extended by a new attribute named attrName. The type of attrName will be int and will have the values between 0 and the cardinality of relationValue minus one. The tuples in relationValue will be extended in either ascending or descending order of the sort-attr-list. If no sort-attr-list argument is given, then the tagging order is arbitrary. If the ?-within? argument is given then the values of attrName attribute will be unique within the subset of tuples which match the values of attr-list. The tag command is useful when a full ordering needs to be placed on a relation. For example, tagging a relation will allow projecting both the tag and another attribute without losing any values.
% relformat [relation tag $DOG -ascending DogName DogId] +-------+---------+-----+ |DogName|Breed |DogId| |string |string |int | +-------+---------+-----+ |Fido |Poodle |0 | |Fred |Spaniel |1 | |Jumper |Mutt |2 | |Rover |Retriever|3 | |Sam |Collie |4 | |Spot |Terrier |5 | +-------+---------+-----+ % relformat [relation tag $OWNERSHIP -ascending OwnerName -within DogName DogId] +---------+-------+--------+-----+ |OwnerName|DogName|Acquired|DogId| |string |string |string |int | +---------+-------+--------+-----+ |Alice |Spot |2001 |0 | |George |Fido |2001 |0 | |George |Sam |2000 |0 | |Jim |Fred |2003 |0 | |Mike |Rover |2002 |0 | |Sue |Fido |2001 |1 | |Sue |Sam |2000 |1 | +---------+-------+--------+-----+
The tclose subcommand returns a new relation value that is the transitive closure of relationValue. The relationValue must be a binary relation and each attribute must be of the same type. The returned relation has a heading that is the same as relationValue. The pairs of attribute values in relationValue may be considered as defining the edges of a graph. The body of the transitive closure will contain all pairs of the two attributes where there exists some path between the vertices in the implied graph. The original relationValue is necessarily a subset of the transitive closure relation. The tclose subcommand often useful when there is a hierarchical relationship among the data values. Consider the employee / supervisor reporting relationship. If we are interested in all people who report directly or indirectly to John, then we have:
% set REPORTS_TO { {Employee string Supervisor string} { {Employee Sue Supervisor John} {Employee Bob Supervisor John} {Employee Jane Supervisor Sue} {Employee Joe Supervisor Sue} {Employee Carl Supervisor Bob} {Employee James Supervisor Alice} } } % relformat [relation project [relation restrictwith\ [relation tclose $REPORTS_TO] {$Supervisor eq "John"}] Employee]\ "Direct/Indirect Reports to John" +--------+ |Employee| |string | +--------+ |Sue | |Bob | |Jane | |Joe | |Carl | +--------+ Direct/Indirect Reports to John -------------------------------
The times subcommand returns the extended Cartesian product of two or more relations. The heading of the result is the union of the headings of all the relationValue arguments. The body of the result consists of a tuple for all combinations of the tuples in all the relationValue arguments. Since the relational multiplication operation is both associative and commutative, the order of the relationValue arguments has no effect on the result.
% relformat [relation times $DOG $OWNER] +-------+---------+---------+---+-------------+ |DogName|Breed |OwnerName|Age|City | |string |string |string |int|string | +-------+---------+---------+---+-------------+ |Fido |Poodle |Sue |24 |Cupertino | |Fido |Poodle |George |35 |Sunnyvale | |Fido |Poodle |Alice |30 |San Jose | |Fido |Poodle |Mike |50 |San Jose | |Fido |Poodle |Jim |42 |San Francisco| |Sam |Collie |Sue |24 |Cupertino | |Sam |Collie |George |35 |Sunnyvale | |Sam |Collie |Alice |30 |San Jose | |Sam |Collie |Mike |50 |San Jose | |Sam |Collie |Jim |42 |San Francisco| |Spot |Terrier |Sue |24 |Cupertino | |Spot |Terrier |George |35 |Sunnyvale | |Spot |Terrier |Alice |30 |San Jose | |Spot |Terrier |Mike |50 |San Jose | |Spot |Terrier |Jim |42 |San Francisco| |Rover |Retriever|Sue |24 |Cupertino | |Rover |Retriever|George |35 |Sunnyvale | |Rover |Retriever|Alice |30 |San Jose | |Rover |Retriever|Mike |50 |San Jose | |Rover |Retriever|Jim |42 |San Francisco| |Fred |Spaniel |Sue |24 |Cupertino | |Fred |Spaniel |George |35 |Sunnyvale | |Fred |Spaniel |Alice |30 |San Jose | |Fred |Spaniel |Mike |50 |San Jose | |Fred |Spaniel |Jim |42 |San Francisco| |Jumper |Mutt |Sue |24 |Cupertino | |Jumper |Mutt |George |35 |Sunnyvale | |Jumper |Mutt |Alice |30 |San Jose | |Jumper |Mutt |Mike |50 |San Jose | |Jumper |Mutt |Jim |42 |San Francisco| +-------+---------+---------+---+-------------+
The tuple subcommand returns the body of relationValue as a tuple. The cardinality of relationValue must be one or an error is returned.
The uinsert subcommand operates in the same fashion as the insert subcommand, except that attempts to insert duplicate tuples are silently ignored. Thus the uinsert subcommand has union type semantics.
The ungroup subcommand maps relations that have relation valued attributes to relations that have scalar valued attributes. It is the complementary operation to the group command. The attribute argument must be the name of a relation valued attribute of relationValue. The returned relation has a heading consisting of all the attributes of relationValue minus attribute plus all the attributes of the attribute relation. The body of the returned relation consists of tuples composed of pairing each tuple from the attribute attribute with its corresponding components from relationValue.
set OwnerContacts { {OwnerName string Contact {Relation {Type string Number string} Type}} { {OwnerName Sue Contact { {Type Home Number 555-1212} {Type Work Number 555-6635}} } {OwnerName George Contact { {Type Home Number 555-8810} {Type Work Number 555-1177} {Type Mobile Number 555-3399}} } } } % relformat $OwnerContacts +---------+-----------------+ |OwnerName|Contact | |string |Relation | +---------+-----------------+ |Sue |+------+--------+| | ||Type |Number || | ||string|string || | |+------+--------+| | ||Home |555-1212|| | ||Work |555-6635|| | |+------+--------+| |George |+------+--------+| | ||Type |Number || | ||string|string || | |+------+--------+| | ||Home |555-8810|| | ||Work |555-1177|| | ||Mobile|555-3399|| | |+------+--------+| +---------+-----------------+ % relformat [relation ungroup $OwnerContacts Contact] +---------+------+--------+ |OwnerName|Type |Number | |string |string|string | +---------+------+--------+ |Sue |Home |555-1212| |Sue |Work |555-6635| |George |Home |555-8810| |George |Work |555-1177| |George |Mobile|555-3399| +---------+------+--------+
The union subcommand returns the set union of two or more relations. All relations must be of the same type. The result relation has a heading that is the same as any of the arguments and has a body consisting of all tuples present in any of the relationValue arguments. Since the union operation is both associative and commutative, the order of the relationValue arguments has no effect the result.
The unwrap subcommand operates on relation values in a manner similar to the way that the tuple unwrap command operates on tuple values. The attribute argument is the name of a tuple valued attribute. The result relation has the same heading as relationValue minus the attribute plus all the attributes contained in the tuple valued attribute, attribute. In essense the tuple valued attribute, attribute, is flattened by one level.
The update subcommand returns a new relation value that has the same heading as relationValue and whose tuple have been modified. Each tuple in relationValue is assigned to the Tcl variable given by tupleVarName and expr is evaluated. If expr is true then script is evaluated. The return value of script is added to the result relation and it must be a tuple that has the same heading as relationValue. If expr is false then the unmodified tuple from relationValue is added to the result. This command serves the same role for relation values as the relvar update command does for relvars.
The wrap subcommand operates on relation values in a manner similar to the way that the tuple wrap command operates on tuple values. The result relation has a heading that is the same as relationValue plus an attribute named, tupleattr, minus the attrN attributes. The values of the attrN attributes are converted into a tuple valued attribute that is stored in tupleattr.
% set r { {Name string Street string City string State string Zip string} { {Name Arnold Street Main City Sacramento State CA Zip 94444} {Name Mary Street Elm City Stockton State CA Zip 95555} } } % relformat [relation wrap $r Address Street City State Zip] +------+---------------------------------+ |Name |Address | |string|Tuple | +------+---------------------------------+ |Arnold|+------+----------+------+------+| | ||Street|City |State |Zip || | ||string|string |string|string|| | |+------+----------+------+------+| | ||Main |Sacramento|CA |94444 || | |+------+----------+------+------+| |Mary |+------+--------+------+------+ | | ||Street|City |State |Zip | | | ||string|string |string|string| | | |+------+--------+------+------+ | | ||Elm |Stockton|CA |95555 | | | |+------+--------+------+------+ | +------+---------------------------------+
Copyright © 2004 - 2010 by G. Andrew Mangogna