Semantic Hypergraph Notation
SH notation is based on these principles:
- The simplest SH structure is the atom, which typically corresponds to a single word and is also considered a hyperedge (e.g. sky/C).
- Atoms belong to one of six basic types: concept (C), predicate (P), modifier (M), builder (B), trigger (T) and conjunction (J).
- Atoms can be combined to form more complex structures, called hyperedges, for example: (blue/M sky/C)
- Non-atomic hyperedges can also be relations (R) and specifiers (S).
- The first element of a non-atomic hyperedge is a connector, followed by arguments that can be atomic or non-atomic hyperedges. Only five types can be connectors: predicates (define relations), modifiers (modify anything), builders (combine two concepts), triggers (define specifiers) and conjunctions (combine anything).
- Predicates and builders have argument roles, which indicate the role of each argument that follows. Argument roles are encoded as characters following a dot, for example: is/P.so.
- Predicates have three possible argument roles: s (subject), o (object) and x (specification).
- Builders have two possible argument roles: m (main) and a (auxiliary).
This is enough to represent any sentence in natural language, for example "The sky is blue":
(is/P.so (the/M sky/C) blue/C)
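The structure described above (atoms, and parenthesized hyperedges whose first element is the connector) can be read with a very small recursive parser. The following is a minimal sketch, not a reference implementation: the function names `tokenize` and `parse` are hypothetical, and it assumes atoms contain no whitespace or parentheses. Atoms are kept as plain strings and non-atomic hyperedges become nested tuples.

```python
def tokenize(text):
    """Split an SH string into parenthesis tokens and atom tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(text):
    """Parse SH notation into nested tuples; atoms stay plain strings."""
    tokens = tokenize(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the closing ")"
            if not items:
                raise ValueError("empty hyperedge")
            return tuple(items)
        if token == ")":
            raise ValueError("unexpected closing parenthesis")
        return token  # an atom such as sky/C

    edge = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after hyperedge")
    return edge


edge = parse("(is/P.so (the/M sky/C) blue/C)")
print(edge)  # ('is/P.so', ('the/M', 'sky/C'), 'blue/C')
```

The first element of each tuple is the connector and the remaining elements are its arguments, mirroring the notation directly.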
SH notation offers two optional features: subtypes and namespaces.
Subtypes provide additional information about an atom's type. For example, maria/Cp indicates that this concept is a proper noun. Unlike the main types, subtypes are meant to be arbitrarily extendable and can have any length. For example, we could define the subtype math and use it in a predicate such as union/Pmath. SH natural language parsers are expected to produce single-character subtypes according to the tables provided below.
Namespaces allow us to distinguish atoms with the same root and type but that correspond to different things, for example: paris/Cp/1 and paris/Cp/2 to distinguish Paris (France) and Paris (Texas). They can also be used to distinguish the language from which an atom was extracted, for example: ciel/Cc/fr, céu/Cc/pt and sky/Cc/en. Like subtypes, namespaces are strings of lowercase alphanumeric characters of arbitrary length.
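To make the parts of an atom concrete, here is a hedged sketch that splits an atom string into root, main type, subtype, argument roles and namespace. The `Atom` tuple and `parse_atom` function are hypothetical names introduced for illustration, and the sketch assumes the root itself never contains a slash.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    root: str
    main_type: str   # one of C, P, M, B, T, J
    subtype: str     # may be empty or longer than one character
    roles: str       # argument roles after the dot, may be empty
    namespace: str   # may be empty


def parse_atom(atom):
    """Split an atom such as paris/Cp/1 or is/P.so into its parts."""
    parts = atom.split("/")
    if len(parts) not in (2, 3):
        raise ValueError(f"not a valid atom: {atom!r}")
    root, annotation = parts[0], parts[1]
    namespace = parts[2] if len(parts) == 3 else ""
    type_part, _, roles = annotation.partition(".")
    if not type_part or type_part[0] not in "CPMBTJ":
        raise ValueError(f"unknown main type in atom: {atom!r}")
    return Atom(root, type_part[0], type_part[1:], roles, namespace)


print(parse_atom("paris/Cp/2"))
print(parse_atom("union/Pmath").subtype)  # math
print(parse_atom("is/P.so").roles)        # so
```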
These additional notational devices can be employed when useful. It is always possible to use the full notation for machine tasks while presenting a simplified version to human readers.
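One way to derive such a simplified presentation is to drop subtypes and namespaces while keeping the main type and argument roles. The sketch below assumes hyperedges are represented as nested tuples of atom strings; `simplify_atom` and `simplify` are hypothetical helper names. Note that it also strips the reserved namespace of special atoms, which may or may not be desirable for a given display.

```python
def simplify_atom(atom):
    """Keep root, main type and argument roles; drop subtype and namespace."""
    parts = atom.split("/")
    if len(parts) < 2:
        return atom  # no type annotation, nothing to simplify
    type_part, dot, roles = parts[1].partition(".")
    return f"{parts[0]}/{type_part[:1]}{dot}{roles}"


def simplify(edge):
    """Apply simplify_atom to every atom of a nested-tuple hyperedge."""
    if isinstance(edge, str):
        return simplify_atom(edge)
    return tuple(simplify(item) for item in edge)


full = ("is/Pv.so", ("the/Md", "sky/Cc/en"), "blue/Ca")
print(simplify(full))  # ('is/P.so', ('the/M', 'sky/C'), 'blue/C')
```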
Hyperedge types
All valid semantic hyperedges are of one of the eight types shown in the table below. The first six types can be explicit (directly annotating an atomic hyperedge) or implicit (inferred from the types of the elements of the hyperedge). The last two types are always implicit.
| Code | Type | Purpose | Example |
|---|---|---|---|
| Atomic or non-atomic | | | |
| C | concept | Define atomic concepts | apple/C |
| P | predicate | Build relations | (is/P.so berlin/C nice/C) |
| M | modifier | Modify any other hyperedge type, including itself | (red/M shoes/C) |
| B | builder | Build concepts from concepts | (of/B.ma capital/C germany/C) |
| T | trigger | Build specifications | (in/T 1994/C) |
| J | conjunction | Define sequences of hyperedges | (and/J meat/C potatoes/C) |
| Non-atomic only | | | |
| R | relation | Express facts, statements, questions, orders, ... | (is/P.so berlin/C nice/C) |
| S | specifier | Relation specification (e.g. condition, time, ...) | (in/T 1976/C) |
Type inference rules
The table below shows how implicit hyperedge types are inferred.
| Element types | Resulting type |
|---|---|
| (M x) | x |
| (B C C+) | C |
| (T [CR]) | S |
| (P [CRS]+) | R |
| (J x y+) | x |
We use the notation of regular expressions: the symbol + denotes one or more entities of the type that precedes it, while square brackets indicate several possibilities (for instance, [CRS]+ means "one or more elements, each of type C, R or S"). x means any type: (M x) is of type x.
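The inference rules can be sketched as a recursive function over nested tuples. This is an illustrative sketch under stated assumptions: `main_type` and `infer_type` are hypothetical names, hyperedges are nested tuples of atom strings, and a predicate applied to its arguments is treated as a relation (R), consistent with the hyperedge types table, where relations are the non-atomic structures that predicates build.

```python
import re


def main_type(atom):
    """Return the main type letter of an atom such as sky/Cc/en."""
    return atom.split("/")[1][0]


def infer_type(edge):
    """Infer the type of an atom or of a nested-tuple hyperedge."""
    if isinstance(edge, str):
        return main_type(edge)
    connector = infer_type(edge[0])  # connectors may themselves be non-atomic
    args = [infer_type(arg) for arg in edge[1:]]
    signature = "".join(args)
    if connector == "M" and len(args) == 1:
        return args[0]                      # (M x) -> x
    if connector == "B" and re.fullmatch(r"CC+", signature):
        return "C"                          # (B C C+) -> C
    if connector == "T" and re.fullmatch(r"[CR]", signature):
        return "S"                          # (T [CR]) -> S
    if connector == "P" and re.fullmatch(r"[CRS]+", signature):
        return "R"                          # predicate plus arguments
    if connector == "J" and len(args) >= 2:
        return args[0]                      # (J x y+) -> x
    raise ValueError(f"no inference rule matches {edge!r}")


print(infer_type(("is/P.so", ("the/M", "sky/C"), "blue/C")))  # R
print(infer_type(("in/T", "1994/C")))                         # S
```

Because the connector's type is itself inferred, a modified predicate such as (not/M is/P.so) is still handled as a predicate.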
Predicate argument roles
Predicates accept the three following argument roles:
- s: subject (at most one)
- o: object (at most one)
- x: specification (any number)
In the simplest terms: the subject is who or what does something, and the object is who or what it is done to. For example:
(plays/P.so maria/C chess/C)
In the previous case of:
(is/P.so (the/M sky/C) blue/C)
"the sky" is the subject, being assigned the property of "blue". These s/o pairs map to the typical dyadic relationships in knowledge graphs, with the predicate playing the role of connection label:
maria (subject) ---[plays]---> chess (object)
the sky (subject) ---[is]---> blue (object)
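The mapping from s/o pairs to labelled edges can be sketched as follows. This is a hedged illustration, assuming nested-tuple hyperedges and an atomic predicate connector; `roles_of`, `render` and `to_triple` are hypothetical names. The `render` helper simply joins atom roots in order, which reads well for modifiers but is not a general surface realizer.

```python
def roles_of(connector):
    """Return the argument-role string of a connector atom, e.g. 'so'."""
    return connector.split("/")[1].partition(".")[2]


def render(edge):
    """Join the roots of all atoms in order (a crude textual rendering)."""
    if isinstance(edge, str):
        return edge.split("/")[0]
    return " ".join(render(item) for item in edge)


def to_triple(relation):
    """Return (subject, label, object) or None if s or o is missing."""
    connector, *args = relation
    # s and o occur at most once, so a dict lookup is safe for them;
    # repeated x roles would overwrite each other but are not used here.
    by_role = dict(zip(roles_of(connector), args))
    if "s" not in by_role or "o" not in by_role:
        return None
    return (render(by_role["s"]), render(connector), render(by_role["o"]))


print(to_triple(("plays/P.so", "maria/C", "chess/C")))
# ('maria', 'plays', 'chess')
print(to_triple(("is/P.so", ("the/M", "sky/C"), "blue/C")))
# ('the sky', 'is', 'blue')
```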
Specifications then allow for further constraints and context in a relation, for example in "Maria plays chess in the afternoon":
(plays/Pv.sox maria/Cp chess/Cc (in/Tt (the/Md afternoon/Cc)))
Specifications can be other relations, for example in "Maria plays chess when it rains":
(plays/Pv.sox maria/Cp chess/Cc (when/Tt (rains/Pv.s it/Ci)))
Every argument filling the specification role (x) must be of type specifier (S). Specifiers are produced by triggers ((T [CR]) → S), so a specification is normally introduced by a trigger word, as in (in/Tt ...) or (when/Tt ...) above. When a specification has no natural trigger word in the surface text, for example a bare indirect object, the argument must still be turned into an S by wrapping it with the appropriate special trigger atom, written _/T*/. with * standing for the trigger subtype (see Special atoms). For instance, "Maria gave Peter a book" has Peter as a recipient with no preposition:
(gave/Pv.sxo maria/Cp (_/Ti/. peter/Cp) (a/Md book/Cc))
Here (_/Ti/. peter/Cp) is a specifier of indirect-object subtype, satisfying the requirement that the x argument be of type S.
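The constraints on predicate argument roles (at most one s, at most one o, any number of x, and every x argument being a specifier) lend themselves to a simple check. The sketch below is illustrative only: `is_specifier` and `check_predicate_roles` are hypothetical names, the predicate connector is assumed to be atomic, and "is of type S" is approximated by "is a non-atomic hyperedge whose connector is a trigger atom".

```python
from collections import Counter


def is_specifier(edge):
    """Approximate test: non-atomic hyperedge led by a trigger atom."""
    return (
        not isinstance(edge, str)
        and isinstance(edge[0], str)
        and edge[0].split("/")[1].startswith("T")
    )


def check_predicate_roles(relation):
    """Return a list of problems found in a relation's argument roles."""
    connector, *args = relation
    roles = connector.split("/")[1].partition(".")[2]
    problems = []
    if len(roles) != len(args):
        problems.append(f"{len(roles)} roles for {len(args)} arguments")
    counts = Counter(roles)
    for role in "so":  # subject and object may appear at most once
        if counts[role] > 1:
            problems.append(f"role '{role}' used {counts[role]} times")
    for role in sorted(set(roles) - set("sox")):
        problems.append(f"unknown role '{role}'")
    for role, arg in zip(roles, args):
        if role == "x" and not is_specifier(arg):
            problems.append(f"x argument {arg!r} is not a specifier")
    return problems


good = ("gave/Pv.sxo", "maria/Cp", ("_/Ti/.", "peter/Cp"), ("a/Md", "book/Cc"))
bad = ("gave/Pv.sxo", "maria/Cp", "peter/Cp", ("a/Md", "book/Cc"))
print(check_predicate_roles(good))  # []
print(check_predicate_roles(bad))
```

The second call reports the bare peter/Cp argument, which fills the x role without being wrapped in a trigger.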
Builders
In builders, the two argument roles are used to identify the main concept and the auxiliary concept:
- m: main concept (exactly once)
- a: auxiliary concept (exactly once)
These codes are used to build strings, where each character corresponds to the argument of the builder in the equivalent position. For example, consider the hyperedge:
(of/B.ma founder/C psychoanalysis/C)
The ma subpart indicates that the first concept following the builder should be considered the main concept, and the next one the auxiliary concept. This defines an implicit taxonomical relation: in this case, the "founder of psychoanalysis" is a type of "founder". In other words, a concept defined with a builder is a more specific type of the main concept.
Other examples: (of/B.ma capital/C france/C) is a type of capital/C and (for/B.ma data/C asia/C) is a type of data/C.
In the general case: (x/B.ma y/C z/C) is a subtype of y/C.
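The implicit taxonomical relation can be read off the role string directly. Here is a minimal sketch, assuming nested-tuple hyperedges, an atomic builder connector and exactly two arguments; `main_concept` is a hypothetical helper name.

```python
def main_concept(edge):
    """Return the argument that fills the m (main) role of a builder."""
    connector, *args = edge
    roles = connector.split("/")[1].partition(".")[2]
    if sorted(roles) != ["a", "m"] or len(args) != 2:
        raise ValueError(f"expected one m and one a role: {edge!r}")
    return args[roles.index("m")]


# (of/B.ma founder/C psychoanalysis/C) is a type of founder/C
print(main_concept(("of/B.ma", "founder/C", "psychoanalysis/C")))  # founder/C

# With roles "am" the main concept is the second argument.
print(main_concept(("+/B.am/.", "alan/Cp", "turing/Cp")))          # turing/Cp
```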
Subtypes
The following tables present the subtypes that SH semantic parsers are expected to produce.
Concept
| Code | Subtype | Example |
|---|---|---|
| Cc | common | apple/Cc |
| Cp | proper | mary/Cp |
| Ci | pronoun | she/Ci |
| Cq | quantitative | 27/Cq |
| Ca | adjective | blue/Ca |
| Cd | determinant | some/Cd |
| Cw | interrogative / wh (nominal) | who/Cw, what/Cw, which/Cw |
| Ce | demonstrative pronoun | this/Ce, that/Ce |
| Cg | nominalized verb / gerund | swimming/Cg |
| Cx | unclassified | |
Predicate
Pv is the default declarative verbal predicate. The mood/kind subtypes below take precedence when they apply; a predicate carries exactly one single-character subtype.
| Code | Subtype | Example |
|---|---|---|
| Pv | verbal (declarative, default) | is/Pv |
| Pi | interrogative | is/Pi (Is the sky blue?) |
| Pj | imperative / jussive | close/Pj (Close the door) |
| Pn | nominal / copular | non-verbal / copula-drop predicate |
| Pe | existential | there is/are |
| Px | unclassified | |
Builder
| Code | Subtype | Example |
|---|---|---|
| Bp | genitive / relational | of/Bp.ma (capital of France) |
| Bm | partitive / measure | cup of coffee, slice of bread |
| Bx | unclassified | |
Appositives ("Obama, the president") are not builders: express them with the generic conjunction :/J/., e.g. (:/J/. obama/Cp (the/Md president/Cc)).
Modifier
| Code | Subtype | Example |
|---|---|---|
| Md | determinant | the/Md |
| Ma | adjective | green/Ma |
| Mq | quantitative | 100/Mq |
| Mm | modal / tense / auxiliary | will/Mm, was/Mm |
| Mb | adverbial / manner | quickly/Mb, carefully/Mb |
| Mg | degree / intensifier | very/Mg, more/Mg, too/Mg |
| Mn | negation | not/Mn |
| Mp | possessive | my/Mp |
| Me | demonstrative determiner | this/Me, that/Me |
| Mw | interrogative determiner | which/Mw, whose/Mw |
| Mx | unclassified | |
Trigger
| Code | Subtype | Description | Example (EN) | Cross-lingual notes |
|---|---|---|---|---|
| Tt | temporal | situates event in time | when/Tt, after/Tt | DE als/wenn, PT quando, FR quand/lorsque |
| Tl | locative | situates event in space | in/Tl, where/Tl | covers static location & path; directionality disambiguated via Ts (source) vs Tl (goal) |
| Ti | indirect object | dative / recipient | to/Ti (gave to Mary) | DE dative, FR à, PT a |
| Ta | passive actor / agent | demoted agent | by/Ta | DE von/durch, PT por, FR par |
| Tb | beneficiary | for whose benefit | for/Tb (for him) | distinct from Tp (purpose); PT para is ambiguous Tb/Tp |
| Ts | source / ablative | origin, provenance | from/Ts, out of/Ts | DE aus/von, PT de, FR de/depuis |
| Tn | manner / means / instrument | how an action is performed | with/Tn (with a hammer), by/Tn (by force), carefully/Tn | conflates Latin ablative roles |
| Tw | comitative | accompaniment | with/Tw (with John) | |
| Tr | reference / topic | aboutness | about/Tr, regarding/Tr | JP は, DE über/bezüglich, PT sobre/acerca de |
| Tq | quantitative / measure | extent, degree, measure | by/Tq (by 5%), for/Tq (for 3 hours) | |
| Tv | privative / negative | absence, exclusion | without/Tv, lacking/Tv | DE ohne, PT sem, FR sans |
| Tf | conditional | hypothesis | if/Tf, unless/Tf | DE wenn/falls, PT se, FR si |
| Tc | causal | cause / reason | because/Tc, since/Tc | DE weil/da, PT porque, FR parce que |
| Tp | purpose / final | intended goal | to/Tp (in order to), so as to/Tp | distinct from To (result): purpose is intended, result is achieved |
| To | result / consecutive | achieved outcome | so that/To, such that/To | |
| Tg | concessive | counter-expectation | although/Tg, even though/Tg | DE obwohl, PT embora, FR bien que |
| Te | comparative | similarity / comparison | like/Te, as/Te, than/Te | |
| Td | declarative complementizer | introduces propositional argument | that/Td (I think that X) | optional in EN/PT, obligatory in DE dass, FR que |
| Tx | unclassified | | | |
Special atoms
Special atoms are annotated with the reserved . namespace.
| Atom | Purpose | Example |
|---|---|---|
| +/B/. | Define compound nouns | (+/B.am/. alan/Cp turing/Cp) |
| :/J/. | Generic conjunction | (:/J/. obama/Cp (the/Md president/Cc)) |
Special trigger atoms
There is one special trigger atom per trigger subtype. Use them to turn a bare concept or relation into a specifier (S) when a specification argument (the x role of a predicate) has no natural trigger word in the surface text. The subtype is chosen to match the semantic role the specification plays.
| Atom | Subtype | Example |
|---|---|---|
| _/Tt/. | temporal | (_/Tt/. monday/Cc) |
| _/Tl/. | locative | (_/Tl/. berlin/Cp) |
| _/Ti/. | indirect object | (_/Ti/. peter/Cp) |
| _/Ta/. | passive actor / agent | (_/Ta/. dog/Cc) |
| _/Tb/. | beneficiary | (_/Tb/. him/Ci) |
| _/Ts/. | source / ablative | (_/Ts/. paris/Cp) |
| _/Tn/. | manner / means / instrument | (_/Tn/. hammer/Cc) |
| _/Tw/. | comitative | (_/Tw/. john/Cp) |
| _/Tr/. | reference / topic | (_/Tr/. politics/Cc) |
| _/Tq/. | quantitative / measure | (_/Tq/. (5/Mq percent/Cc)) |
| _/Tv/. | privative / negative | (_/Tv/. money/Cc) |
| _/Tf/. | conditional | (_/Tf/. (rains/Pv.s it/Ci)) |
| _/Tc/. | causal | (_/Tc/. rain/Cc) |
| _/Tp/. | purpose / final | (_/Tp/. safety/Cc) |
| _/To/. | result / consecutive | (_/To/. victory/Cc) |
| _/Tg/. | concessive | (_/Tg/. rain/Cc) |
| _/Te/. | comparative | (_/Te/. lion/Cc) |
| _/Td/. | declarative complementizer | (_/Td/. (won/Pv.s she/Ci)) |
| _/Tx/. | unclassified | (_/Tx/. thing/Cc) |
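Wrapping a bare argument in the matching special trigger atom is mechanical once the semantic role is known. The sketch below is a hedged illustration: `TRIGGER_SUBTYPES`, `as_specifier` and `to_string` are hypothetical names, the subtype codes are transcribed from the table above, and choosing the right subtype for a given argument is left to the caller.

```python
# Trigger subtype codes, transcribed from the special trigger atoms table.
TRIGGER_SUBTYPES = {
    "t": "temporal", "l": "locative", "i": "indirect object",
    "a": "passive actor / agent", "b": "beneficiary",
    "s": "source / ablative", "n": "manner / means / instrument",
    "w": "comitative", "r": "reference / topic",
    "q": "quantitative / measure", "v": "privative / negative",
    "f": "conditional", "c": "causal", "p": "purpose / final",
    "o": "result / consecutive", "g": "concessive", "e": "comparative",
    "d": "declarative complementizer", "x": "unclassified",
}


def as_specifier(arg, subtype):
    """Wrap a concept or relation in the special trigger for a subtype."""
    if subtype not in TRIGGER_SUBTYPES:
        raise ValueError(f"unknown trigger subtype: {subtype!r}")
    return (f"_/T{subtype}/.", arg)


def to_string(edge):
    """Render a nested-tuple hyperedge back into SH notation."""
    if isinstance(edge, str):
        return edge
    return "(" + " ".join(to_string(item) for item in edge) + ")"


print(to_string(as_specifier("peter/Cp", "i")))
# (_/Ti/. peter/Cp)
print(to_string(as_specifier(("5/Mq", "percent/Cc"), "q")))
# (_/Tq/. (5/Mq percent/Cc))
```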