Understanding Information Language

August 21, 2016

Web 2.0

HTTP (Hypertext Transfer Protocol)--the communication standard for delivering information on the web

HTML (Hypertext Markup Language) tells the browser how to display text, but it says nothing about the meaning of the text (Hastings, 2015)

URL (Uniform Resource Locator)--the address we use to find a page on the web

APIs (application programming interfaces) let us pass information back and forth as we use web services like Facebook

We have clicked on links since the beginning, but the data behind the pages is locked up and we can't get access to it.
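
To make the URL entry above concrete, here is a minimal sketch that splits an address into its parts using Python's standard library. The address itself is hypothetical and used only for illustration.

```python
from urllib.parse import urlparse

# Hypothetical address, not a real catalog.
url = "http://www.example.org/catalog/search?q=metadata#results"

parts = urlparse(url)

print(parts.scheme)    # the protocol used to fetch the page
print(parts.netloc)    # the host that serves the page
print(parts.path)      # where the page lives on that host
print(parts.query)     # extra parameters passed to the server
print(parts.fragment)  # a position inside the page
```

Everything in the address is about locating a page; nothing in it says what the page means.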

Web 3.0

URI (Uniform Resource Identifier)--identifies individual "things" on the web or elsewhere (unlike the URL, which locates web pages)

Four kinds of metadata standards (Schilling, 2012), as defined by Zeng and Qin (2008)

Structure standards, like the Dublin Core Metadata Element Set (DCMES), which was established as an international standard (ISO, 2009)
Content standards, like the Anglo-American Cataloguing Rules (AACR2) or the International Standard Bibliographic Description (ISBD)
Value standards, like the Library of Congress Subject Headings (LCSH)
Exchange standards, like the MARC 21 Format for Bibliographic Data (MARC 21)
Others

VRA Core
Categories for the Description of Works of Art (CDWA)
Encoded Archival Description (EAD)
Art and Architecture Thesaurus (AAT)
Iconclass
Thesaurus for Graphic Materials (TGM)
Getty Thesaurus of Geographic Names (TGN)
Resource Description and Access (RDA)
Metadata Encoding and Transmission Standard (METS), a Digital Library Federation initiative maintained by the Library of Congress and registered with NISO in 2004

Data Exchange Standards

MARC 21--allows libraries to exchange metadata, but the output process is inflexible
Extensible Markup Language (XML) allows users to design their own markup languages to meet their own needs and encourages cross-platform sharing. It can identify that two items are related, but not how
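
As a small illustration of the XML entry above, this sketch invents its own tags and reads them back with Python's standard library. The tag names (`book`, `relatedTo`) and the values are made up for the example, not taken from any standard.

```python
import xml.etree.ElementTree as ET

# A home-made markup language: every tag name here is invented.
record = """<book>
  <title>Example Title</title>
  <author>Example Author</author>
  <relatedTo>Another Example Title</relatedTo>
</book>"""

root = ET.fromstring(record)

# Collect each child element as a tag -> text pair.
fields = {child.tag: child.text for child in root}
print(fields)
```

The `relatedTo` tag says that the two titles are connected, but a program reading it has no way to learn what kind of connection is meant.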

RSS (Really Simple Syndication) is an XML-based format that allows republishing to other sites or downloading by users
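
A rough sketch of how another site or a reader program might pick items out of an RSS feed. The feed below is a hypothetical one written inline for the example; a real program would download it first.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, kept inline so the example is self-contained.
feed = """<rss version="2.0">
  <channel>
    <title>Example Library News</title>
    <link>http://www.example.org/news</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>New arrivals</title>
      <link>http://www.example.org/news/1</link>
    </item>
    <item>
      <title>Holiday hours</title>
      <link>http://www.example.org/news/2</link>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(feed).find("channel")

# Each <item> becomes a (title, link) pair that could be republished elsewhere.
items = [(item.findtext("title"), item.findtext("link"))
         for item in channel.findall("item")]

for title, link in items:
    print(title, link)
```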

Standard Generalized Markup Language (SGML) is hierarchical and flexible; XML is a subset of it

Metadata has three categories

Metadata is defined as structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage information (Rubin)

Descriptive--identifies the specifics of a resource; this is traditional library cataloging
Administrative--information to help manage a resource, such as when and how it was created. It includes:

technical (file characteristics)
rights management metadata (intellectual property rights)
preservation

Structural--indicates how compound objects are put together: the relationships between physical files and pages, between pages and chapters, and between chapters and the book
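
The three categories above can be sketched as one record for a digitized book. This is only an illustration: the class names, field names, and values are all assumptions made for the example, not part of any metadata standard.

```python
from dataclasses import dataclass, field

@dataclass
class Administrative:
    # The three administrative sub-types listed above.
    technical: dict = field(default_factory=dict)
    rights: dict = field(default_factory=dict)
    preservation: dict = field(default_factory=dict)

@dataclass
class MetadataRecord:
    descriptive: dict               # what the resource is
    administrative: Administrative  # how it is managed
    structural: list                # (part, whole) pairs

# Hypothetical record; every value is invented.
record = MetadataRecord(
    descriptive={"title": "Example Book", "creator": "Example Author"},
    administrative=Administrative(
        technical={"format": "image/tiff"},
        rights={"holder": "Example Library"},
        preservation={"checksum_algorithm": "SHA-256"},
    ),
    structural=[
        ("page-001.tif", "chapter-1"),
        ("page-002.tif", "chapter-1"),
        ("chapter-1", "book"),
    ],
)

def parts_of(whole, structural):
    """Return the parts recorded as belonging directly to `whole`."""
    return [part for part, parent in structural if parent == whole]

print(parts_of("chapter-1", record.structural))
print(parts_of("book", record.structural))
```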

Semantic Web Standards (W3C, see https://www.w3.org/Consortium/)

RDF (Resource Description Framework)--a language for representing information about resources on the World Wide Web. It represents metadata about web resources (title, author, modification dates, copyright, licensing)

underpins the whole semantic web
includes "statements" and graphs
SPARQL is the query language used across diverse data sources
middleware allows other data to be viewed as RDF
HTTP protocol--retrieval mechanism
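
To show what RDF "statements" look like, here is a minimal sketch that holds a few statements as Python tuples and writes them out in the line-based N-Triples style. The Dublin Core property URIs are real; the book and subject URIs are hypothetical. The rule "anything starting with http:// is a URI, everything else is plain text" is a simplification made for the example.

```python
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element namespace
book = "http://www.example.org/book/1"    # hypothetical resource

statements = [
    (book, DC + "title", "Example Book"),
    (book, DC + "creator", "Example Author"),
    (book, DC + "subject", "http://www.example.org/subject/7"),
]

def to_ntriples(triples):
    """Write each statement as one '<s> <p> o .' line."""
    lines = []
    for subject, predicate, obj in triples:
        if obj.startswith("http://"):
            # Simplifying assumption: treat it as a URI.
            obj_text = f"<{obj}>"
        else:
            # Otherwise treat it as a text value and escape quotes.
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            obj_text = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj_text} .")
    return "\n".join(lines)

print(to_ntriples(statements))
```

Taken together, the statements form a small graph: the book is one node, and each property is an arrow to a value or to another node.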

Vocabularies expressed in RDF

RDF Schema or RDF Vocabulary definition language (RDFS) provides mechanisms for describing groups of related resources and the relationships between these resources
Web Ontology Language (OWL) supports ontology development and sharing via the Web, with the goal of making web content more accessible to machines
Other vocabularies include FOAF, SIOC, SKOS, DOAP, vCard, Dublin Core, OAI-ORE, and GoodRelations

Simple Knowledge Organization System (SKOS) provides a bridge between the different communities of practice within library and information science that design and apply knowledge organization systems, and between these communities and the semantic web. It helps to move existing systems to RDF
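
A rough sketch of the kind of structure SKOS is used to carry over: a subject hierarchy in which each concept points to a broader one, in the spirit of the `skos:broader` property. The concept names are hypothetical, and a plain dictionary stands in for real RDF here.

```python
# concept -> its broader concept (invented concepts)
broader = {
    "ex:cats": "ex:mammals",
    "ex:dogs": "ex:mammals",
    "ex:mammals": "ex:animals",
}

def ancestors(concept, broader):
    """Walk up the hierarchy, from nearest to most general concept."""
    chain = []
    while concept in broader:
        concept = broader[concept]
        chain.append(concept)
    return chain

def narrower(concept, broader):
    """Return the concepts that sit directly below `concept`."""
    return sorted(c for c, parent in broader.items() if parent == concept)

print(ancestors("ex:cats", broader))
print(narrower("ex:mammals", broader))
```
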
Berners-Lee's four principles of linked data (Bizer, 2010)

Use URIs (Uniform Resource Identifiers) as names for things (from web pages, to objects in the real world, to abstract concepts) to distinguish one thing from another (this tells other computers where to look for data)

URIs allow for the creation of hyperlink-based data discovery and browsing
The Library of Congress has started to give URIs to subject headings, names, genre terms, country codes, and languages

Use HTTP URIs so that people can look up those names (anyone with a domain name can create URI references)

HTTP is a standardized access mechanism

When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
Include links to other URIs so that they can discover more things
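
The second and third principles can be sketched as an ordinary HTTP request that asks for data instead of a web page. This is only a sketch: the URI is hypothetical, and the request is built but never sent, so no server is assumed to exist.

```python
from urllib.request import Request

# Hypothetical URI naming a "thing" (a subject heading, say).
thing_uri = "http://www.example.org/id/subject/42"

# The Accept header tells the server we would like RDF back
# (Turtle first, RDF/XML as a second choice) rather than HTML.
request = Request(
    thing_uri,
    headers={"Accept": "text/turtle, application/rdf+xml;q=0.9"},
)

print(request.get_method())          # the HTTP verb that would be used
print(request.full_url)              # the URI being looked up
print(request.get_header("Accept"))  # the formats we asked for
```

Sending the request, for example with `urllib.request.urlopen(request)`, would only return RDF if the server at that address follows the third principle.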

Linked data triple (OCLC video)

subject, predicate (relationship), object
examples: www.bbc.co.uk/nature/, schema.org, Google Knowledge Graph (the panel displayed on the right side of the search page is linked data)
DBpedia, a linked data version of Wikipedia
by using linked data, programs talk to each other using triples
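
A minimal sketch of how a program can ask questions of a set of triples. The data and the short prefixed names (`ex:`, `dc:`, `foaf:`) are hypothetical shorthand, and the `match` helper is a toy stand-in for what a query language such as SPARQL does, with `None` playing the role of a variable.

```python
# Each triple is (subject, predicate, object); all values are invented.
triples = [
    ("ex:book1", "dc:creator", "ex:author1"),
    ("ex:book2", "dc:creator", "ex:author1"),
    ("ex:author1", "foaf:name", "Example Author"),
]

def match(pattern, data):
    """Return the triples that fit `pattern`; None matches anything."""
    return [
        triple for triple in data
        if all(want is None or want == have
               for want, have in zip(pattern, triple))
    ]

# "Which things were created by author1?"
books = [s for s, _, _ in match((None, "dc:creator", "ex:author1"), triples)]
print(books)

# "What is author1 called?"
names = [o for _, _, o in match(("ex:author1", "foaf:name", None), triples)]
print(names)
```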

Heath and Bizer (2011) add that there are three types of links

relationship links
identity links
vocabulary links
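
A rough sketch of how the three link types might be told apart in practice. It rests on assumptions that go beyond these notes: that identity links are typically expressed with `owl:sameAs`, that vocabulary links use properties such as `rdfs:subClassOf` or `owl:equivalentClass`, and that everything else counts as a relationship link. The example triples are hypothetical.

```python
# Assumed predicate sets; the notes above do not define them.
IDENTITY = {"owl:sameAs"}
VOCABULARY = {"rdfs:subClassOf", "owl:equivalentClass", "owl:equivalentProperty"}

def link_type(predicate):
    """Classify a link by the predicate it uses."""
    if predicate in IDENTITY:
        return "identity"
    if predicate in VOCABULARY:
        return "vocabulary"
    return "relationship"

# Invented links pointing from one data source to another.
links = [
    ("ex:author1", "foaf:knows", "other:person9"),
    ("ex:author1", "owl:sameAs", "other:person1"),
    ("ex:Novel", "rdfs:subClassOf", "other:Book"),
]

for subject, predicate, obj in links:
    print(subject, predicate, obj, "->", link_type(predicate))
```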

Currently, libraries cannot talk to web search engines because their languages are different
Search Engine Optimization (SEO) could provide a way for libraries to enhance their visibility by exposing their metadata as linked data
Why is this important for libraries? (OCLC video)

people can more easily find library resources on the web
more creative applications based on library metadata
opportunities for cataloging efficiency and innovation--treats data like building blocks rather than a whole building. Pieces can be removed or gathered one by one; they don't need to arrive as a full picture
Doesn't it also:

allow the computer to draw from more material
provide a richer interface to show more connections
allow users to contribute

VIAF (Virtual International Authority File)

(Taken from Bizer 2010) By employing HTTP URIs to identify resources, the HTTP protocol as retrieval mechanism, and the RDF data model to represent resource descriptions, Linked Data directly builds on the general architecture of the Web (Jacobs & Walsh, 2004). The Web of Data can therefore be seen as an additional layer that is tightly interwoven with the classic document Web and has many of the same properties:

• The Web of Data is generic and can contain any type of data.
• Anyone can publish data to the Web of Data.
• Data publishers are not constrained in choice of vocabularies with which to represent data.
• Entities are connected by RDF links, creating a global data graph that spans data sources and enables the discovery of new data sources.

From an application development perspective the Web of Data has the following characteristics:

• Data is strictly separated from formatting and presentational aspects.
• Data is self-describing. If an application consuming Linked Data encounters data described with an unfamiliar vocabulary, the application can dereference the URIs that identify vocabulary terms in order to find their definition.
• The use of HTTP as a standardized data access mechanism and RDF as a standardized data model simplifies data access compared to Web APIs, which rely on heterogeneous data models and access interfaces.
• The Web of Data is open, meaning that applications do not have to be implemented against a fixed set of data sources, but can discover new data sources at run-time by following RDF links.
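
The last point, discovering new data sources at run-time by following links, can be sketched with a small in-memory stand-in for the Web of Data. Everything here is hypothetical: the dictionary plays the role of "looking up a URI", and the `ex:` names are invented.

```python
from collections import deque

# URI -> the triples returned when that URI is looked up.
web_of_data = {
    "ex:a": [("ex:a", "ex:linksTo", "ex:b")],
    "ex:b": [("ex:b", "ex:linksTo", "ex:c"), ("ex:b", "ex:label", "B")],
    "ex:c": [],
}

def discover(start, lookup):
    """Follow links breadth-first and list every URI reached."""
    seen = [start]
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        for _, _, obj in lookup.get(uri, []):
            # Only follow objects that can themselves be looked up;
            # plain text values such as "B" are skipped.
            if obj in lookup and obj not in seen:
                seen.append(obj)
                queue.append(obj)
    return seen

print(discover("ex:a", web_of_data))
```

The program starts out knowing about one source only; the other two are found by following the links in the data itself.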

Making Meaning: MVcommonplacebook
