Friday, 24 of October of 2014

Beginner’s glossary to linked data

This is a beginner’s glossary to linked data. It is a part of the yet-to-be-published LiAM Guidebook on linked data in archives.

  • API – (see application programming interface)
  • application programming interface (API) – an abstracted set of functions and commands used to get output from remote computer applications. These functions and commands are not necessarily tied to any specific programming language and therefore allow programmers to use a programming language of their choice.
  • content negotiation – a process whereby a user-agent and HTTP server mutually decide what data format will be exchanged during an HTTP request. In the world of linked data, content negotiation is very important when URIs are requested by user-agents because it helps determine whether HTML or serialized RDF will be returned.
  • extensible markup language (XML) – a standardized data structure that is made up of a minimum of rules and can easily be used to represent everything from tiny bits of data to long narrative texts. XML is designed to be read by people as well as computers, but because of this it is often considered verbose, and ironically, difficult to read.
  • HTML – (see hypertext markup language)
  • HTTP – (see hypertext transfer protocol)
  • hypertext markup language (HTML) – an XML-like data structure intended to be rendered by user-agents whose output is for people to read. For the most part, HTML is used to mark up text and denote a text’s stylistic characteristics such as headers, paragraphs, and list items. It is also used to mark up the hypertext links (URLs) between documents.
  • hypertext transfer protocol (HTTP) – the formal name for the way the World Wide Web operates. It begins with one computer program (a user-agent) requesting content from another computer program (a server) and getting back a response. Once received, the response is formatted for reading by a person or for processing by a computer program. The shape and content of both the request and the response are what make up the protocol.
  • JavaScript object notation (JSON) – like XML, a data structure allowing arbitrarily large sets of values to be associated with an arbitrarily large set of names (variables). JSON was first natively implemented as a part of the JavaScript language, but has since become popular in other computer languages.
  • JSON – (see JavaScript object notation)
  • linked data – the stuff and technical process for making real the ideas behind the Semantic Web. It begins with the creation of serialized RDF and making the serialization available via HTTP. User agents are then expected to harvest the RDF, combine it with other harvested RDF, and ideally use it to bring to light new or existing relationships between real world objects — people, places, and things — thus creating and enhancing human knowledge.
  • linked open data – a qualification of linked data whereby the information being exchanged is expected to be “free” as in gratis.
  • ontology – a highly structured vocabulary. In the parlance of linked data, ontologies are used to denote, describe, and qualify the predicates of RDF triples. Ontologies have been defined for a very wide range of human domains, everything from bibliography (Dublin Core or MODS), to people (FOAF), to sounds (Audio Features).
  • RDF – (see resource description framework)
  • representational state transfer (REST) – a process for querying remote HTTP servers and getting back computer-readable results. The process usually involves specifying name-value pairs in a URL and getting back something like XML or JSON.
  • resource description framework (RDF) – the conceptual model for describing the knowledge of the Semantic Web. It is rooted in the notion of triples whose subjects and objects are literally linked with other triples through the use of actionable URIs.
  • REST – (see representational state transfer)
  • Semantic Web – an idea articulated by Tim Berners-Lee whereby human knowledge is expressed in a computer-readable fashion and made available via HTTP so computers can harvest it and bring to light new information or knowledge.
  • serialization – a manifestation of RDF; one of any number of textual expressions of RDF triples. Examples include but are not limited to RDF/XML, RDFa, N3, and JSON-LD.
  • SPARQL – (see SPARQL protocol and RDF query language)
  • SPARQL protocol and RDF query language (SPARQL) – a formal specification for querying and returning results from RDF triple stores. It looks and operates very much like the structured query language (SQL) of relational databases complete with its SELECT, WHERE, and ORDER BY clauses.
  • triple – the atomistic facts making up RDF. Each fact is akin to a rudimentary sentence with three parts: 1) subject, 2) predicate, and 3) object. Subjects are expected to be URIs. Ideally, objects are URIs as well, but can also be literals (words, phrases, or numbers). Predicates are akin to the verbs in a sentence and they denote a relationship between the subject and object. Predicates are expected to be members of a formalized ontology.
  • triple store – a database of RDF triples, usually accessible via SPARQL.
  • uniform resource identifier (URI) – a unique pointer to a real-world object or a description of an object. In the parlance of linked data, URIs are expected to have the same shape and function as URLs, and if they do, then the URIs are often described as “actionable”.
  • uniform resource locator (URL) – an address denoting the location of something on the Internet. These addresses usually specify a protocol (like http), a host (or computer) where the protocol is implemented, and a path (directory and file) specifying where on the computer the item of interest resides.
  • URI – (see uniform resource identifier)
  • URL – (see uniform resource locator)
  • user agent – the formal name for any program that makes requests to an HTTP server. A “Web browser” is the most familiar kind of user agent, but the term “Web browser” usually denotes an application whose results are viewed by people. In linked data, user agents are often programs whose output is read by other computer programs.
  • XML – (see extensible markup language)
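
To make the entries for triple, serialization, and triple store a little more concrete, here is a minimal sketch in Python. It is only a toy: the example.org URIs and the helper names (to_ntriples, match) are made up for illustration, the "store" is a plain list rather than a real triple store, and the serializer handles only simple cases in the style of N-Triples. The predicates come from the FOAF ontology mentioned above.

```python
from typing import List, Optional, Tuple

# A triple is modeled here as a (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]

# Hypothetical subject URIs, invented for this example.
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"

# Namespace of the FOAF ontology, which supplies the predicates.
FOAF = "http://xmlns.com/foaf/0.1/"

# A toy "triple store": just a list held in memory.
triples: List[Triple] = [
    (ALICE, FOAF + "name", "Alice"),  # the object is a literal
    (ALICE, FOAF + "knows", BOB),     # the object is another URI
    (BOB, FOAF + "name", "Bob"),
]


def is_uri(value: str) -> bool:
    """Crude assumption for this sketch: anything starting with http:// is a URI."""
    return value.startswith("http://")


def to_ntriples(store: List[Triple]) -> str:
    """Serialize the triples one per line, in the style of N-Triples."""
    lines = []
    for s, p, o in store:
        # URIs are wrapped in angle brackets, literals in double quotes.
        obj = "<" + o + ">" if is_uri(o) else '"' + o + '"'
        lines.append("<" + s + "> <" + p + "> " + obj + " .")
    return "\n".join(lines)


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the given parts; None acts as a wildcard.

    This loosely mimics what a single pattern in a SPARQL WHERE clause does.
    """
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


if __name__ == "__main__":
    print(to_ntriples(triples))
    # Whom does Alice know?
    for _, _, friend in match(triples, s=ALICE, p=FOAF + "knows"):
        print(friend)
```

Because both subjects and one of the objects are URIs, the second triple links Alice's description to Bob's, which is the "linking" in linked data. A real application would more likely use an RDF library and a SPARQL endpoint than hand-rolled helpers like these.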

For a more complete and exhaustive glossary, see the W3C’s Linked Data Glossary.

