Understanding Information Language
Web 2.0
HTTP (Hypertext Transfer Protocol) is the communication standard for delivering information on the web
HTML (Hypertext Markup Language) tells the browser how to display text, but it says nothing about the meaning of the text (Hastings, 2015)
URL (Uniform Resource Locator) is how we find a page on the web
APIs (application programming interfaces) let us pass information back and forth as we use web services like Facebook
We have clicked on links since the beginning, but the data is locked up and we can't get access to it.
Web 3.0
URI (Uniform Resource Identifier) identifies individual "things" on the web or elsewhere, rather than the web pages that a URL locates
Four kinds of metadata standards (Schilling, 2012), as defined by Zeng and Qin (2008)
- Structure standards, like the Dublin Core Metadata Element Set (DCMES), which has been adopted as an international standard (ISO, 2009)
- Content standards, like the Anglo-American Cataloguing Rules (AACR2) or the International Standard Bibliographic Description (ISBD)
- Value standards, like the Library of Congress Subject Headings (LCSH)
- Exchange standards, like the MARC 21 format for bibliographic data (MARC 21)
- Others
- VRA Core
- Categories for the Description of Works of Art (CDWA)
- Encoded Archival Description (EAD)
- Art and Architecture Thesaurus (AAT)
- Iconclass
- Thesaurus for Graphic Materials (TGM)
- Thesaurus of Geographic Names (TGN)
- Resource Description and Access (RDA)
- Metadata Encoding and Transmission Standard (METS), created by NISO in 2004
Data Exchange Standards
- MARC 21--allows libraries to exchange metadata, but its output process is inflexible
- Extensible Markup Language (XML) allows users to design their own markup languages to meet their own needs and encourages cross-platform sharing. It can identify that two items are related, but not how
- RSS (Really Simple Syndication) is an XML-based format that allows content to be republished on other sites or downloaded by users
- Standard Generalized Markup Language (SGML) is hierarchical and flexible; XML is a subset of it
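The XML point above (markup can show that two items are related, but not how) can be sketched in Python with the standard-library `xml.etree.ElementTree` module. The tag names and values here are invented for illustration, which is exactly the freedom XML gives its users.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the tag names and values are made up for this sketch.
record = """
<book>
  <title>Example Title</title>
  <person>A. Writer</person>
</book>
"""

root = ET.fromstring(record)

# Nesting tells a program that <title> and <person> belong to <book> ...
related = [(root.tag, child.tag, child.text) for child in root]
print(related)

# ... but nothing in the syntax says whether the person is the author,
# the editor, or the subject. That meaning lives outside the markup.
```

A program reading this record learns only the parent and child relationship; it has to be told separately what `person` means for a `book`.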
Metadata has three categories
Metadata is defined as structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage information (Rubin)
- Descriptive--identifies specifics; traditional library cataloging
- Administrative--information to help manage a resource such as when and how it was created
- technical (file characteristics)
- rights management metadata (intellectual property rights)
- preservation
- Structural--indicates how compound objects are related, such as the relationships between physical files and pages, between pages and chapters, and between chapters and the book
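As a rough illustration of the three categories, here is one hypothetical record for a digitized book written as a Python dictionary. Every field name and value is an assumption made for this sketch and does not come from any particular metadata standard.

```python
# Hypothetical record for a digitized book; all keys and values are invented.
record = {
    "descriptive": {
        "title": "Example Title",
        "creator": "A. Writer",
        "subject": "Metadata",
    },
    "administrative": {
        "technical": {"file_format": "image/tiff", "resolution_dpi": 600},
        "rights": {"copyright_status": "in copyright"},
        "preservation": {"checksum_algorithm": "SHA-256"},
    },
    "structural": {
        "chapters": [
            {"label": "Chapter 1", "pages": ["page_001.tif", "page_002.tif"]},
            {"label": "Chapter 2", "pages": ["page_003.tif"]},
        ]
    },
}


def reading_order(rec):
    """Use the structural metadata to list the page files in book order."""
    return [
        page
        for chapter in rec["structural"]["chapters"]
        for page in chapter["pages"]
    ]


print(reading_order(record))
```

The descriptive block is what a catalog search would use, the administrative block is what a repository manager would use, and the structural block is what lets software reassemble the scanned files into chapters and a book.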
Semantic Web Standards (W3C, see https://www.w3.org/Consortium/)
- RDF (Resource Description Framework)--a language for representing information about resources on the World Wide Web. Represents metadata about web resources (title, author, modification dates, copyright, licensing)
- underpins the whole semantic web
- includes "statements" and graphs
- SPARQL is the query language used across diverse data sources
- middleware allows other data to be viewed as RDF
- HTTP protocol--retrieval mechanism
- Vocabularies expressed in RDF
- RDF Schema, or RDF Vocabulary Description Language (RDFS), provides mechanisms for describing groups of related resources and the relationships between these resources
- Web Ontology Language (OWL) supports ontology development and sharing via the Web, with the goal of making web content more accessible to machines
- Other vocabularies include FOAF, SIOC, SKOS, DOAP, vCard, Dublin Core, OAI-ORE and GoodRelations
- Simple Knowledge Organization System (SKOS) provides a bridge between the different communities of practice within library and information science that design and apply knowledge organization systems, and between these communities and the semantic web. Helps to move systems to RDF
- Berners-Lee's four principles of linked data (Bizer, 2010)
- Use URIs (Uniform Resource Identifiers) as names for things (from web pages, to objects in the real world, to abstract concepts) to distinguish one thing from another (tells other computers where to look for data)
- URIs allow for the creation of hyperlink-based data discovery and browsing
- The Library of Congress has started to give URIs to subject headings, names, genre terms, country codes, and languages
- Use HTTP URIs so that people can look up those names (anyone with a domain name can create URI references)
- HTTP is a standardized access mechanism
- When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
- Include links to other URIs so that they can discover more things
- Linked data triple (OCLC video)
- subject, relationship, object
- examples: www.bbc.co.uk/nature/; schema.org; Google Knowledge Graph (the panel displayed on the right side of the search results page is linked data)
- DBpedia, a linked data version of Wikipedia
- by using linked data, programs talk to each other using triples
- Heath and Bizer (2011) add that there are three types of links
- relationship links
- identity links
- vocabulary links
- Currently, libraries cannot talk to web search engines because the languages are different
- Search Engine Optimization (SEO) could provide a way for libraries to enhance their visibility by exposing their metadata as linked data
- Why important for libraries? (OCLC video)
- people can more easily find library resources on the web
- more creative applications based on library metadata
- opportunities for cataloging efficiency and innovation--treats data as building blocks rather than a whole building. Pieces can be removed or gathered one at a time; they don't need to arrive as a complete picture
- doesn't it also
- allow the computer to draw from more material
- provide a richer interface to show more connections
- allow users to contribute
- VIAF
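The subject, relationship, object triples and the three link types from Heath and Bizer (2011) listed above can be sketched with plain Python tuples. The FOAF, OWL and RDF term URIs are real vocabulary terms, but the `example.*` URIs and the facts they state are hypothetical, and classifying the `rdf:type` statement as a vocabulary link is my reading of their definition. The `match` function is a naive pattern matcher that stands in for the kind of question a SPARQL query asks; it is not SPARQL.

```python
# Real vocabulary terms.
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical data: the example.* URIs and these facts are invented.
ALICE = "http://example.org/people/alice"
triples = [
    # Relationship link: points at a related thing in another data source.
    (ALICE, FOAF_KNOWS, "http://example.net/staff/bob"),
    # Identity link: points at another URI that names the same thing.
    (ALICE, OWL_SAME_AS, "http://example.com/id/alice"),
    # Vocabulary link: points at the definition of the term being used.
    (ALICE, RDF_TYPE, FOAF_PERSON),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in store
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# "Who does Alice know?" expressed as a triple pattern.
for _s, _p, o in match(triples, subject=ALICE, predicate=FOAF_KNOWS):
    print(o)
```

Because every part of a triple is a name that another program can look up, two systems that have never been configured to work together can still exchange statements like these.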
(Taken from Bizer 2010) By employing HTTP URIs to identify resources, the HTTP protocol as retrieval mechanism, and the RDF data model to represent resource descriptions, Linked Data directly builds on the general architecture of the Web (Jacobs & Walsh, 2004). The Web of Data can therefore be seen as an additional layer that is tightly interwoven with the classic document Web and has many of the same properties:
- The Web of Data is generic and can contain any type of data.
- Anyone can publish data to the Web of Data.
- Data publishers are not constrained in choice of vocabularies with which to represent data.
- Entities are connected by RDF links, creating a global data graph that spans data sources and enables the discovery of new data sources.
From an application development perspective the Web of Data has the following characteristics:
- Data is strictly separated from formatting and presentational aspects.
- Data is self-describing. If an application consuming Linked Data encounters data described with an unfamiliar vocabulary, the application can dereference the URIs that identify vocabulary terms in order to find their definition.
- The use of HTTP as a standardized data access mechanism and RDF as a standardized data model simplifies data access compared to Web APIs, which rely on heterogeneous data models and access interfaces.
- The Web of Data is open, meaning that applications do not have to be implemented against a fixed set of data sources, but can discover new data sources at run-time by following RDF links.
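The last characteristic, discovering new data sources at run-time by following RDF links, can be illustrated with a small simulation. This sketch makes no network requests: `FAKE_WEB` is a hypothetical in-memory table standing in for the Web, `dereference` pretends to be an HTTP lookup, and all the `*.example` URIs are invented. Only the Dublin Core and OWL property URIs are real.

```python
from collections import deque
from urllib.parse import urlparse

# Stand-in for the Web: looking up a URI returns the triples published there.
FAKE_WEB = {
    "http://library.example/book/1": [
        ("http://library.example/book/1",
         "http://purl.org/dc/terms/creator",
         "http://authors.example/id/42"),
    ],
    "http://authors.example/id/42": [
        ("http://authors.example/id/42",
         "http://www.w3.org/2002/07/owl#sameAs",
         "http://names.example/n/7"),
    ],
    "http://names.example/n/7": [],
}


def dereference(uri):
    """Pretend HTTP GET: return the triples found at this URI, if any."""
    return FAKE_WEB.get(uri, [])


def discover_sources(start_uri):
    """Follow object links breadth-first; list hosts in order of discovery."""
    seen = {start_uri}
    hosts = []
    queue = deque([start_uri])
    while queue:
        uri = queue.popleft()
        host = urlparse(uri).netloc
        if host not in hosts:
            hosts.append(host)
        for _subject, _predicate, obj in dereference(uri):
            # Only objects that are HTTP URIs are followed in this sketch.
            if obj.startswith("http://") and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return hosts


print(discover_sources("http://library.example/book/1"))
```

The application starts out knowing only the library's URI. The author and name services are never configured in advance; they are found because the data links to them.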