Semantic technologies for the Internet of Things
Payam Barnaghi
Institute for Communication Systems (ICS)
University of Surrey
Guildford, United Kingdom
International “IoT 360” Summer School
October 29th – November 1st, 2014 – Rome, Italy
2. Things, Data, and lots of it
image courtesy: Smarter Data - I.03_C by Gwen Vanhee
3. Data in the IoT
− Data is collected by sensory devices and also by
crowd-sensing sources.
− It is time and location dependent.
− It can be noisy and the quality can vary.
− It is often continuous - streaming data.
− There are other important issues such as:
− Device/network management
− Actuation and feedback (command and control)
− Service and entity descriptions are also important.
4. “Raw data is both an oxymoron and bad data”
Geoff Bowker, 2005
Source: Kate Crawford, "Algorithmic Illusions: Hidden Biases of Big Data", Strata 2013.
5. From data to actionable information
− Wisdom?: actionable information
− Knowledge: abstractions and perceptions
− Information: structured data (with semantics)
− Data: raw sensory data
7. Semantics and Data
− Data with semantic annotations
− Provenance, quality of information
− Interpretable formats
− Links and interconnections
− Background knowledge, domain information
− Hypotheses, expert knowledge
− Adaptable and context-aware solutions
8. Interoperable and Semantically described
Data is the starting point to create an
efficient set of Actions.
The goal is often to create actionable
information.
9. Wireless Sensor (and Actuator) Networks
[Figure: sensor nodes with in-node data processing; a sink node and gateway with data aggregation/fusion; a core network (e.g. the Internet); the “Web of Things” with inference/processing of IoT data; and the end-user with interoperable computer services. Open questions marked on the figure: operating systems, services, protocols, and interoperable, machine-interpretable representations.]
- The networks typically run low-power devices.
- Each device consists of one or more sensors, possibly of different types (or actuators).
10. What we are going to study
− The sensors (and in general “Things”) are increasingly being
connected with Web infrastructure.
− This can be supported by embedded devices that directly support
IP and web-based connection (e.g. 6LoWPAN and CoAP) or devices
that are connected via gateway components.
− Broadening the IoT to the concept of “Web of Things”
− There are already standards such as Sensor Web Enablement
(SWE) set developed by the Open Geospatial Consortium (OGC)
that are widely being adopted in industry, government and
academia.
− While such frameworks provide some interoperability, semantic
technologies are increasingly seen as a key enabler for integration
of IoT data and broader Web information systems.
11. Data formats
[Figure: observation and measurement data annotated with tags and location. Source: Cosm.com]
12. Observation and measurement data
15, C, 08:15, 51.243057, -0.589444
− 15: value
− C: unit of measurement
− 08:15: time
− 51.243057: latitude
− -0.589444: longitude
How can we make the data representations more machine-readable
and machine-interpretable?
13. Observation and measurement data
15, C, 08:15, 51.243057, -0.589444
<value>
<unit>
<time>
<latitude>
<longitude>
What about this?
<value>15</value>
<unit>C</unit>
<time>08:15</time>
<longitude>-0.589444</longitude>
<latitude>51.243057</latitude>
14. Extensible Markup Language (XML)
− XML is a simple, flexible text format that is used
for data representation and annotation.
− XML was originally designed for large-scale
electronic publishing.
− XML plays a key role in the exchange of a wide
variety of data on the Web and elsewhere.
− It is one of the most widely used formats for
sharing structured information.
15. XML Document Example
<?xml version="1.0"?>
<measurement>
  <value>15</value>
  <unit>C</unit>
  <time>08:15</time>
  <longitude>-0.589444</longitude>
  <latitude>51.243057</latitude>
</measurement>
− The first line is the XML prolog (the XML declaration).
− <measurement> is the root element; the tags nested inside it are XML elements.
− XML documents MUST be “well formed”.
17. Well Formed XML Documents
− A "Well Formed" XML document has correct XML
syntax.
− XML documents must have a root element
− XML elements must have a closing tag
− XML tags are case sensitive
− XML elements must be properly nested
− XML attribute values must be quoted
Source: W3Schools, http://www.w3schools.com/
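The well-formedness rules above can be checked mechanically with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One well-formed document and one counter-example per rule on the slide.
GOOD = """<?xml version="1.0"?>
<measurement>
  <value>15</value>
  <unit>C</unit>
</measurement>"""

TWO_ROOTS = "<value>15</value><unit>C</unit>"            # no single root element
NO_CLOSING_TAG = "<measurement><value>15</measurement>"  # <value> never closed
WRONG_CASE = "<measurement><Value>15</value></measurement>"    # tags are case sensitive
BAD_NESTING = "<measurement><value>15</measurement></value>"   # improperly nested
UNQUOTED_ATTRIBUTE = "<value unit=C>15</value>"          # attribute value not quoted
```

Note that this only tests syntax. Whether the document is also valid against a schema is a separate check that the standard library does not provide.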
18. Validating XML Documents
− A "Valid" XML document is a "Well Formed" XML
document, which conforms to the structure of the
document defined in an XML Schema.
− XML Schema defines the structure and a list of
defined elements for an XML document.
19. XML Schema: example
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="measurement">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="value" type="xs:decimal"/>
        <xs:element name="unit" type="xs:string"/>
        <xs:element name="time" type="xs:time"/>
        <xs:element name="longitude" type="xs:double"/>
        <xs:element name="latitude" type="xs:double"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
- XML Schema defines the structure and elements.
- An XML document then becomes an instantiation of the document defined
by the schema.
20. XML Documents: revisiting the example
<?xml version="1.0"?>
<measurement>
  <value>15</value>
  <unit>C</unit>
  <time>08:15</time>
  <longitude>-0.589444</longitude>
  <latitude>51.243057</latitude>
</measurement>
But what about this?
<?xml version="1.0"?>
<sensor_data>
  <reading>15</reading>
  <u>C</u>
  <timestamp>08:15</timestamp>
  <long>-0.589444</long>
  <lat>51.243057</lat>
</sensor_data>
21. XML
− Meaning of XML-Documents is intuitively clear
− due to "semantic" Mark-Up
− tags are domain-terms
− But, computers do not have intuition
− tag-names do not provide semantics for machines.
− DTDs or XML Schema specify the structure of
documents, not the meaning of the document
contents
− XML lacks a semantic model
− has only a “surface model”, i.e. a tree
Source: Semantic Web, John Davies, BT, 2003.
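To illustrate why tag names carry no meaning for a machine, the sketch below reads the two example documents from the earlier slide (`<measurement>` and `<sensor_data>`). A reader written against one vocabulary finds nothing in the other, and the only fix at the XML level is a hand-written mapping. The function names and the `TAG_ALIASES` table are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

DOC_A = """<measurement>
  <value>15</value><unit>C</unit><time>08:15</time>
</measurement>"""

DOC_B = """<sensor_data>
  <reading>15</reading><u>C</u><timestamp>08:15</timestamp>
</sensor_data>"""


def read_value(xml_text):
    """Return the text of the <value> child, or None if no such tag exists."""
    root = ET.fromstring(xml_text)
    return root.findtext("value")


# A hand-written mapping between the two vocabularies (an assumption made here).
# This is exactly the kind of pre-arranged agreement the parties would need.
TAG_ALIASES = {"reading": "value", "u": "unit", "timestamp": "time"}


def normalise(xml_text):
    """Map each child element to a shared name using the alias table."""
    root = ET.fromstring(xml_text)
    return {TAG_ALIASES.get(child.tag, child.tag): child.text for child in root}
```

`read_value(DOC_B)` returns `None` even though both documents describe the same observation, while `normalise` yields the same dictionary for both only because the alias table was agreed in advance.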
22. XML: limitations for semantic markup
− XML representation makes no commitment on:
− Domain-specific ontological vocabulary
− Which words shall we use to describe a given set of concepts?
− Ontological modelling primitives
− How can we combine these concepts, e.g. “car is a-kind-of
(subclass-of) vehicle”?
− It therefore requires pre-arranged agreement on
vocabulary and primitives.
− This is only feasible for closed collaboration:
− agents in a small & stable community
− pages on a small & stable intranet
− .. not for sharable Web resources
Source: Semantic Web, John Davies, BT, 2003.
23. Semantic Web technologies
− XML provides a metadata format.
− It defines the elements but does not provide
any modelling primitives, nor does it describe the
meaningful relations between different
elements.
− Semantic technologies can be used to address these
issues.
24. A bit of history
− “The Semantic Web is an extension of the current web
in which information is given well-defined meaning,
better enabling computers and people to work in co-operation.”
(Tim Berners-Lee et al, 2001)
Image source: Miller 2004
25. Semantics & the IoT
− The Semantic Sensor (&Actuator) Web is an extension
of the current Web/Internet in which information is given
well-defined meaning, better enabling objects, devices
and people to work in co-operation and to also enable
autonomous interactions between devices and/or
objects.
26. Resource Description
Framework (RDF)
− A W3C standard
− Relationships between documents
− Consisting of triples or sentences:
− <subject, property, object>
− <“Sensor”, hasType, “Temperature”>
− <“Node01”, hasLocation, “Room_BA_01” >
− RDFS extends RDF with standard “ontology
vocabulary”:
− Class, Property
− Type, subClassOf
− domain, range
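As a rough illustration of what the RDFS vocabulary adds, the sketch below stores triples as plain Python tuples and derives extra types by following `subClassOf` links. The class names `TemperatureSensor` and `Device` are assumptions added for the example, and the code is a simplified stand-in for RDFS entailment, not an RDF library.

```python
# Triples are (subject, property, object) tuples; all names are illustrative.
TRIPLES = {
    ("TemperatureSensor", "subClassOf", "Sensor"),
    ("Sensor", "subClassOf", "Device"),
    ("Node01", "type", "TemperatureSensor"),
    ("Node01", "hasLocation", "Room_BA_01"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls through subClassOf links."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def inferred_types(resource, triples):
    """Return the asserted types of a resource plus all their superclasses."""
    asserted = {o for s, p, o in triples if s == resource and p == "type"}
    result = set(asserted)
    for cls in asserted:
        result |= superclasses(cls, triples)
    return result
```

Only one type is asserted for `Node01`, yet `inferred_types("Node01", TRIPLES)` also contains `Sensor` and `Device`. That kind of conclusion is something XML tag names alone cannot support.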
27. RDF for semantic annotation
− RDF provides metadata about resources
− Object -> Attribute -> Value triples or
− Subject -> Property -> Object
− It can be represented in XML
− The RDF triples form a graph
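31. Let’s add a bit more structure (complexity?)
[Figure: a graph model of a measurement]
− Measurement -> hasValue -> xsd:decimal
− Measurement -> hasUnit -> xsd:string
− Measurement -> hasTime -> xsd:time
− Measurement -> hasLocation -> Location
− Location -> hasLongitude -> xsd:double
− Location -> hasLatitude -> xsd:double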
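32. An instance of our model
[Figure: an instance graph of the measurement model]
− Measurement #0001 -> hasValue -> 15
− Measurement #0001 -> hasUnit -> C
− Measurement #0001 -> hasTime -> 08:15
− Measurement #0001 -> hasLocation -> Location #0126
− Location #0126 -> hasLongitude -> -0.589444
− Location #0126 -> hasLatitude -> 51.243057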
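The instance graph for Measurement #0001 can be written down directly as a list of triples. The sketch below is one possible encoding, assuming that 51.243057 is the latitude and -0.589444 the longitude, and using made-up identifiers with an `ex:` prefix. The `describe` helper follows links between resources and assumes the graph has no cycles.

```python
MEASUREMENT = "ex:Measurement_0001"  # hypothetical identifiers
LOCATION = "ex:Location_0126"

GRAPH = [
    (MEASUREMENT, "hasValue", 15.0),
    (MEASUREMENT, "hasUnit", "C"),
    (MEASUREMENT, "hasTime", "08:15"),
    (MEASUREMENT, "hasLocation", LOCATION),
    (LOCATION, "hasLatitude", 51.243057),
    (LOCATION, "hasLongitude", -0.589444),
]


def objects(graph, subject, prop):
    """Return all objects o such that (subject, prop, o) is in the graph."""
    return [o for s, p, o in graph if s == subject and p == prop]


def describe(graph, subject):
    """Collect a resource's properties, expanding objects that are resources.

    Assumes the graph is acyclic; a cycle would make this recurse forever.
    """
    subjects = {s for s, _, _ in graph}
    description = {}
    for s, p, o in graph:
        if s == subject:
            description[p] = describe(graph, o) if o in subjects else o
    return description
```

Unlike the flat XML record, the location is a resource of its own here, so other measurements could point to the same `Location` node.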
33. RDF: Basic Ideas
− Resources
− Every resource has a URI (Uniform Resource
Identifier).
− A URI can be a URL (a web address) or some
other kind of identifier.
− An identifier does not necessarily enable
access to a resource.
− We can think of a resource as an object that
we want to describe, for example:
− Car
− Person
− Places, etc.
34. RDF: Basic Ideas
− Properties
− Properties are a special kind of resource.
− Properties describe relations between resources.
− For example: “hasLocation”, “hasType”, “hasID”,
“startTime”, “deviceID”.
− Properties in RDF are also identified by URIs.
− This provides a global, unique naming scheme.
− For example:
− “hasLocation” can be defined as:
− URI: http://www.loa-cnr.it/ontologies/DUL.owl#hasLocation
− SPARQL is a query language for RDF data.
− SPARQL provides capabilities to query RDF graph patterns
along with their conjunctions and disjunctions.
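To give a feel for what "querying graph patterns" means, the sketch below matches a conjunction of triple patterns against a small in-memory graph. Terms starting with `?` are variables. This is a simplified stand-in for a SPARQL basic graph pattern, written in plain Python under the assumption that no RDF library is available. The function names and sample data are illustrative, and features such as disjunction (`UNION`), filters and prefixes are left out.

```python
def match(pattern, triple, binding):
    """Try to extend a variable binding so that the pattern equals the triple."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if isinstance(term, str) and term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(graph, patterns):
    """Return every binding that satisfies all patterns (a conjunction)."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in graph:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


GRAPH = [
    ("Node01", "hasType", "Temperature"),
    ("Node01", "hasLocation", "Room_BA_01"),
    ("Node02", "hasType", "Humidity"),
    ("Node02", "hasLocation", "Room_BA_01"),
    ("Node03", "hasType", "Temperature"),
    ("Node03", "hasLocation", "Room_BA_02"),
]

# Roughly: SELECT ?s WHERE { ?s :hasType "Temperature" . ?s :hasLocation "Room_BA_01" }
TEMPERATURE_IN_ROOM = [
    ("?s", "hasType", "Temperature"),
    ("?s", "hasLocation", "Room_BA_01"),
]
```

Because `?s` appears in both patterns, the two conditions are joined on the same resource, which is how a conjunction of graph patterns behaves.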
35. Ontologies
− The term ontology originates from philosophy.
In that context it is used as the name of a
subfield of philosophy, namely, the study of the
nature of existence.
− In the Semantic Web:
− An ontology is a formal specification of a domain:
the concepts in a domain and the relationships between the
concepts (and some logical restrictions).
36. Ontologies and Semantic Web
− In general, an ontology describes a set of
concepts in a domain.
− An ontology consists of a finite list of terms and
the relationships between the terms.
− The terms denote important concepts (classes of
objects) of the domain.
− For example, in a university setting, staff
members, students, courses, modules, lecture
theatres, and schools are some important
concepts.
37. Web Ontology Language (OWL)
− RDF(S) is useful to describe the concepts and their
relationships, but does not solve all possible requirements
− Complex applications may want more possibilities:
− similarity and/or differences of terms (properties or classes)
− construct classes, not just name them
− can a program reason about some terms? e.g.:
− each «Sensor» resource «A» has at least one «hasLocation»
− each «Sensor» resource «A» has at most one ID
− This led to the development of the Web Ontology Language, or
OWL.
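The two example restrictions ("at least one hasLocation", "at most one ID") can be approximated as a closed-world validation pass over a set of triples, as sketched below. This is an assumption-laden simplification. An OWL reasoner works under the open-world assumption and would not treat a missing `hasLocation` as an error, so the code only illustrates the intent of cardinality restrictions. The property name `hasID` and the sample data are hypothetical.

```python
from collections import Counter


def cardinality_violations(graph, cls, prop, minimum=0, maximum=None):
    """Return members of cls whose number of prop values is outside the bounds."""
    members = {s for s, p, o in graph if p == "type" and o == cls}
    counts = Counter(s for s, p, _ in graph if p == prop)
    violations = []
    for member in members:
        n = counts[member]  # Counter returns 0 for members with no value
        if n < minimum or (maximum is not None and n > maximum):
            violations.append(member)
    return sorted(violations)


GRAPH = [
    ("Node01", "type", "Sensor"),
    ("Node01", "hasLocation", "Room_BA_01"),
    ("Node01", "hasID", "a1"),
    ("Node02", "type", "Sensor"),  # no location, two IDs
    ("Node02", "hasID", "b1"),
    ("Node02", "hasID", "b2"),
]
```

With this data, `Node02` is reported for both restrictions, while `Node01` satisfies them.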
38. OWL
− OWL provides more concepts to express meaning
and semantics than XML and RDF(S).
− OWL provides more constructs for stating logical
expressions, such as: Equality, Property
Characteristics, Property Restrictions, Restricted
Cardinality, Class Intersection, Annotation
Properties, Versioning, etc.
Source: http://www.w3.org/TR/owl-features/
39. Ontology engineering
− An ontology: classes and properties (also referred
to as schema ontology)
− Knowledge base: a set of individual instances of
classes and their relationships
− Steps for developing an ontology:
− defining classes in the ontology and arranging the
classes in a taxonomic (subclass–superclass) hierarchy
− defining properties and describing allowed values and
restrictions for these properties
− Adding instances and individuals
40. Basic rules for designing ontologies
− There is no one correct way to model a domain;
there are always possible alternatives.
− The best solution almost always depends on the
application that you have in mind and the required
scope and details.
− Ontology development is an iterative process.
− The ontologies provide a sharable and extensible form to
represent a domain model.
− Concepts that you choose in an ontology should
be close to physical or logical objects and
relationships in your domain of interest (using
meaningful nouns and verbs).
41. A simple methodology
1. Determine the domain and scope of the model for which you want to
design your ontology.
2. Consider reusing existing concepts/ontologies; this will help to
increase the interoperability of your ontology.
3. Enumerate important terms in the ontology; this will determine
what are the key concepts that need to be defined in an ontology.
4. Define the classes and the class hierarchy; decide on the classes
and the parent/child relationships
5. Define the properties of classes; define the properties that relate
the classes;
6. Define features of the properties, if you are going to add
restrictions or other OWL-type logical expressions.
7. Define/add instances
42. Semantic technologies in the IoT
− Applying semantic technologies to IoT can
support:
− Interoperability
− effective data access and integration
− resource discovery
− reasoning and processing of data
− knowledge extraction (for automated decision making
and management)
43. Data/Service description frameworks
− There are standards such as Sensor Web Enablement
(SWE) set developed by the Open Geospatial Consortium
that are widely being adopted in industry, government and
academia.
− While such frameworks provide some interoperability,
semantic technologies are increasingly seen as a key enabler
for integration of IoT data and broader Web information
systems.
44. Revisiting goals of the
Internet of Things
− A primary goal of interconnecting devices and
collecting/processing data from them is to create
situation awareness and enable applications,
machines, and human users to better understand
their surrounding environments.
− The understanding of a situation, or context,
potentially enables services and applications to
make intelligent decisions and to respond to the
dynamics of their environments.
45. Sensor Model Language (SensorML)
The Sensor Model Language Encoding (SensorML) defines models and XML
encoding to represent the geometric, dynamic, and observational
characteristics of sensors and sensor systems.
Source: http://www.mitre.org/
46. Using semantics
− Find all available resources (which can provide
data) and data related to “Room A” (which is an
object in the linked data).
− What is “Room A”? What is its location? This returns “location”
data.
− What type of data is available for “Room A” or that “location”?
(sensor types)
− Predefined rules can be applied based on the
available data:
− (TempRoom_A > 80°C) AND (SmokeDetectedRoom_A position==TRUE)
→ FireEventRoom_A
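The predefined rule above can be sketched as a small function over the current state of a room. The `RoomState` class, its field names and the default threshold parameter are assumptions introduced for this example. In a semantic setup the inputs would come from querying the annotated data for “Room A” rather than from a hand-built object.

```python
from dataclasses import dataclass


@dataclass
class RoomState:
    """Latest observations for one room (hypothetical structure)."""
    temperature_c: float
    smoke_detected: bool


def fire_event(state: RoomState, threshold_c: float = 80.0) -> bool:
    """Apply the rule: temperature above the threshold AND smoke detected."""
    return state.temperature_c > threshold_c and state.smoke_detected
```

For example, `fire_event(RoomState(95.0, True))` fires the rule, while a hot room without smoke, or a smoky room at exactly 80°C, does not.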
47. Semantic modelling
− Lightweight: experience shows that a lightweight
ontology model that balances expressiveness
and inference complexity well is more likely to be
widely adopted and reused; also, the large number of
IoT resources and the huge amount of data need
efficient processing.
− Compatibility: an ontology needs to be consistent
with well-designed, existing ontologies to
ensure compatibility wherever possible.
− Modularity: a modular approach facilitates
ontology evolution, extension and integration
with external ontologies.
48. Existing models- SSN Ontology
− W3C Semantic Sensor Network Incubator Group’s
SSN ontology (mainly for sensors and sensor
networks, platforms and systems).
http://www.w3.org/2005/Incubator/ssn/
49. Stimulus-Sensor-Observation
- The SSO Ontology Design Pattern was developed
following the principle of minimal ontological
commitments to make it reusable for a variety of
application areas.
- It introduces a minimal set of classes and relations
centered around the notions of stimuli, sensors, and
observations.
- It defines stimuli as the (only) link to the physical
environment.
52. SSN Ontology
Ontology Link: http://www.w3.org/2005/Incubator/ssn/ssnx/ssn
M. Compton et al, "The SSN Ontology of the W3C Semantic Sensor Network Incubator Group", Journal of Web Semantics, 2012.
53. W3C SSN Ontology
[Figure: SSN-XG ontology scope, showing SSN-XG ontologies and SSN-XG annotations. Callouts: makes observations of this type; what it measures; where it is; units.]
54. What SSN does not model
− Sensor types and models
− Networks: communication, topology
− Representation of data and units of measurement
− Location, mobility or other dynamic behaviours
− Control and actuation
− ….
55. Web of Things
− Integrating the real world data
into the Web and providing
Web-based interactions with
the IoT resources is also often
discussed under the umbrella term
of “Web of Things” (WoT).
− WoT data is not only large in
scale and volume, but also
continuous, with rich
spatiotemporal dependency.
56. Web of Things
Connecting sensors, actuators and other devices to the World
Wide Web.
Things’ data and capabilities are exposed as web
data/services.
This enables an interoperable usage of IoT resources (e.g.
sensors, devices, their data and capabilities) through
web-based discovery, access, tasking, and alerting.
58. The world of IoT and Semantics:
Challenges and issues
59. Some good existing models:
SSN Ontology
Ontology Link: http://www.w3.org/2005/Incubator/ssn/ssnx/ssn
M. Compton et al, "The SSN Ontology of the W3C Semantic Sensor Network Incubator Group", Journal of Web Semantics, 2012.
60. Semantic Sensor Web
“The semantic sensor Web enables
interoperability and advanced analytics
for situation awareness and other
advanced applications from
heterogeneous sensors.”
(Amit Sheth et al, 2008)
62. We have good models and description
frameworks;
The problem is that having good
models and developing ontologies is
not enough.
63. Semantic descriptions are intermediary
solutions, not the end product.
They should be transparent to the end-user
and probably to the data producer
as well.
64. A WoT/IoT Framework
[Figure: several WSNs and network-enabled devices connect through gateways using CoAP, HTTP, MQTT and 6LoWPAN; data is semantically annotated; an example resource address is http://mynet1/snodeA23/readTemp?; and several other protocols and solutions…]
65. Publishing Semantic annotations
− We need a model (ontology) – this is often the easy part
for a single application.
− Interoperability between the models is a big issue.
− Expressiveness vs. complexity is a challenge
− How and where to add the semantics
− Where to publish and store them
− Semantic descriptions for data, streams, devices
(resources) and entities that are represented by the
devices, and description of the services.
67. Hyper/CAT
- Servers provide catalogues of resources to
clients.
- A catalogue is an array of URIs.
- Each resource in the catalogue is annotated
with metadata (RDF-like triples).
Source: Toby Jaffey, HyperCat Consortium, http://www.hypercat.io/standard.html
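The catalogue idea (an array of resource URIs, each annotated with relation/value metadata) can be sketched as follows. The JSON key names (`items`, `href`, `metadata`, `rel`, `val`), the `urn:example:` relations and the `example.org` URIs are assumptions made for this illustration and have not been checked against the HyperCat specification, which should be consulted for the actual format.

```python
import json

# Hypothetical catalogue: each item is a URI plus RDF-like (rel, val) pairs.
CATALOGUE = json.loads("""
{
  "items": [
    {"href": "http://example.org/sensors/temp01",
     "metadata": [
       {"rel": "urn:example:hasType", "val": "Temperature"},
       {"rel": "urn:example:hasLocation", "val": "Room_A"}]},
    {"href": "http://example.org/sensors/smoke01",
     "metadata": [
       {"rel": "urn:example:hasType", "val": "Smoke"},
       {"rel": "urn:example:hasLocation", "val": "Room_A"}]},
    {"href": "http://example.org/sensors/temp02",
     "metadata": [
       {"rel": "urn:example:hasType", "val": "Temperature"},
       {"rel": "urn:example:hasLocation", "val": "Room_B"}]}
  ]
}
""")


def find_resources(catalogue, rel, val):
    """Return the URIs of all items carrying the given (rel, val) annotation."""
    return [
        item["href"]
        for item in catalogue["items"]
        if any(m["rel"] == rel and m["val"] == val for m in item["metadata"])
    ]
```

A client can then discover resources by annotation, for instance all resources located in `Room_A`, without knowing their URIs in advance.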
68. Hyper/CAT model
Source: Toby Jaffey, HyperCat Consortium, http://www.hypercat.io/standard.html
69. Complex models are (sometimes) good
for publishing research papers….
But they are often difficult to
implement and use in real world
products.
70. What happens afterwards is more important
− How to index and query the annotated data
− How to make the publication suitable for constrained
environments and/or allow them to scale
− How to query them (considering the fact that here we are
dealing with live data and often reducing the processing
time and latency is crucial)
− Linking to other sources
71. The IoT is a dynamic, online and rapidly
changing world
[Figure: annotation for the (Semantic) Web compared with annotation for the IoT, with an isPartOf relation shown. Image sources: ABC Australia and 2dolphins.com]
73. Creating common vocabularies and
taxonomies is equally important,
e.g. event taxonomies.
74. We should accept the fact that
sometimes we do not need (full)
semantic descriptions.
Think of the applications and use-cases
before starting to annotate the data.
75. Semantic descriptions can be fairly
static on the Web;
In the IoT, the meaning of data and
the annotations can change over
time/space…
77. Dynamic Semantics
<iot:measurement>
  <iot:type>temp</iot:type>
  <iot:unit>Celsius</iot:unit>
  <time>12:30:23UTC</time>
  <iot:accuracy>80%</iot:accuracy>
  <loc:lat>51.2365</loc:lat>
  <loc:long>0.5703</loc:long>
</iot:measurement>
- But this could also be a
function of time and
location.
- What would the
accuracy be 5 seconds after
the measurement?
- Should it be a part of this
model?
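One way to make the question "what would the accuracy be 5 seconds after the measurement?" concrete is to compute the annotation at query time instead of storing a fixed value. The sketch below assumes, purely for illustration, that accuracy decays exponentially with a given half-life. The decay model, the function name and the parameter values are all hypothetical and are not taken from the slides.

```python
def accuracy_after(initial: float, elapsed_s: float, half_life_s: float) -> float:
    """Accuracy after elapsed_s seconds under an assumed exponential decay."""
    if elapsed_s < 0 or half_life_s <= 0:
        raise ValueError("elapsed_s must be >= 0 and half_life_s must be > 0")
    return initial * 0.5 ** (elapsed_s / half_life_s)


def annotate(measurement: dict, elapsed_s: float, half_life_s: float = 60.0) -> dict:
    """Return a copy of the measurement with its accuracy recomputed for 'now'."""
    updated = dict(measurement)
    updated["accuracy"] = accuracy_after(measurement["accuracy"], elapsed_s, half_life_s)
    return updated


READING = {"type": "temp", "unit": "Celsius", "accuracy": 0.80}
```

Whether such a function belongs in the model itself or in the processing layer is exactly the design question the slide raises. The sketch only shows that the annotation need not be static.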
78. Dynamic annotations for data in the
process chain
S. Kolozali et al, "A Knowledge-based Approach for Real-Time IoT Data Stream Annotation and Processing", iThings 2014, 2014.
79. Dynamic annotations for provenance data
S. Kolozali et al, "A Knowledge-based Approach for Real-Time IoT Data Stream Annotation and Processing", iThings 2014, 2014.
81. Extraction of events and semantics from social media
[Figure: tweets from a city; city infrastructure]
https://osf.io/b4q2t/
P. Anantharam, P. Barnaghi, K. Thirunarayan, A. Sheth, "Extracting city events from social streams", 2014.
83. Overall, we need semantic technologies
in the IoT and these play a key role in
providing interoperability.
84. However, we should design and use
the semantics carefully and
consider the constraints and
dynamicity of the IoT environments.
85. #1: Design for large-scale and provide tools and
APIs.
#2: Think of who will use the semantics and how
when you design your models.
#3: Provide means to update and change the
semantic annotations.
86. #4: Create tools for validation and interoperability
testing.
#5: Create taxonomies and vocabularies.
#6: Of course you can always create a better
model, but try to re-use existing ones as much as
you can.
87. #7: Link your data and descriptions to other
existing resources.
#8: Define rules and/or best practices for providing
the values for each attribute.
#9: Remember the widely used semantic
descriptions on the Web are simple ones like
FOAF.
88. #10: Semantics are only one part of the solution
and often not the end-product so the focus of the
design should be on creating effective methods,
tools and APIs to handle and process the
semantics.
Query methods, machine learning, reasoning and
data analysis techniques and methods should be
able to effectively use these semantics.
89. Data analytics framework
[Figure: framework components: data, domain knowledge, social systems, interactions, ambient intelligence, open interfaces, open data, quality and trust, privacy and security.]
91. IoT data: semantic related issues
− Current IoT data communications often rely on binary or syntactic data
models, which do not provide machine-interpretable meaning for the
data.
− Syntactic representation or in some cases XML-based data
− Often no general agreement on annotating the data
− requires a pre-agreement between different parties to be able to
process and interpret the data
− Limited reasoning based on the content and context data
− Limited interoperability in data and resource/device description level
− Data integration and fusion issues
92. Requirements
− Structured representation of concepts
− Machine-interpretable descriptions
− Reasoning mechanisms
− Access mechanism to heterogeneous resource descriptions with
diverse capabilities
− Automated interactions and horizontal integration with existing
applications
93. What are the challenges?
− The models provide the basic description frameworks, but
alignment between different models and frameworks is required.
− Semantics are the starting point, reasoning and interpretation of
data is required for automated processes.
− Real interoperability happens when data/services from different
frameworks and providers can be interchanged and used with
minimised intervention.
94. Possible solutions?
− The semantic Web has faced this problem earlier.
− Proposed solution: using machine-readable and machine-interpretable
meta-data
− Important note: machine-interpretable but not machine-untreatable!
− Well defined standards and description frameworks: RDF, OWL, SPARQL
− Variety of open-source, commercial tools for creating/managing/querying
and accessing semantic data
− Jena, Sesame, Protégé, …
− An Ontology defines conceptualisation of a domain.
− Terms and concepts
− A common vocabulary
− Relationships between the concepts
− There are several existing and emerging ontologies in the IoT domain.
− HyperCat model
− W3C SSN ontology
− And many more
− Automated annotation methods, dynamic semantics
95. How to adapt the solutions?
− Creating ontologies and defining data models are not enough
− tools to create and annotate data
− data handling components
− Complex models and ontologies look good, but
− design lightweight versions for constrained environments
− think of practical issues
− make it as compatible as possible with, and/or link it to, other
existing ontologies
− Domain knowledge and instances
− Common terms and vocabularies
− Location, unit of measurement, type, theme, …
− Link it to other resources
− Linked-data
− URIs and naming
− In many cases, semantic annotations and semantic processing
should be intermediary not the end products.
96. What are the practical steps?
− Linked data approach is a promising way of integrating data from
different sources and interlinking semantic descriptions;
− Alignment between different description models for
Services/Resources/Entities;
− Using common models (e.g. HyperCat, SSNO) and developing
applications and services that use the information represented
with these models;
− Ontology learning from real world data;
− Dynamic and automated annotations;
− Semantic processing, scalable (distributed) repository, discovery,
query and analysis support;
− Tools and support for real-time and streaming (semantically
annotated) data;
97. Quiz
− Design a simple ontology (model) to describe
the operating system and the different sensors on a
smartphone.
98. Q&A
− Payam Barnaghi, University of
Surrey/EU FP7 CityPulse Project
http://www.ict-citypulse.eu/
@pbarnaghi
p.barnaghi@surrey.ac.uk