Modelling an OWL Ontology from Useragent Strings

For one of the assignments of the Semantic Web module of my MSc, I was required to model an OWL ontology.

I opted to model the ontology using real-world data from the e-commerce domain, deriving the taxonomy from the useragent strings saved at the time of each order.

The useragent strings of 50 000 orders were analysed.

The assignment requirements are below. Answers to selected questions are shown as well:

  • Part 1: Establishing the domain, determining the competency questions and creating the taxonomy
  • Part 2: Modelling the concepts based on the taxonomy in Part 1
  • Part 3: Writing SPARQL queries to answer the competency questions in Part 1
  • Part 4: Resources
1 Select a domain which you want to model (e.g., history, geography, architecture).

The selected domain is devices with web-browsing capabilities.

2 Write a brief (20-25 lines) scenario description for your choice of domain that reflects your choice of classes and properties.
You must keep a specific application in context.
Provide a justification of why you chose the domain.

When a user browses to a website (be it on a mobile phone, tablet, etc.), the browser sends its useragent string in the HTTP request headers, where the website can read it. The useragent strings can be analysed on the fly using a device detection service (such as WURFL or 51Degrees), or extracted and saved for later analysis. This data gives a better understanding of the types of devices being used and their capabilities, enabling us to provide customised interfaces that do not hinder conversion (important for an e-commerce website).

This is an example of a useragent string:

Mozilla/5.0 (Linux; Android 4.4.2; MNPH717M Build/KOT49H) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/ Mobile Safari/537.36

For example, when I browse a website on my smartphone, the website is able to identify, by analysing the useragent string, that I am on an Apple device (an iPhone 5), running iOS 8.x as the operating system, and using Safari as the web browser. The website would also be able to identify that my device is touchscreen-enabled, allowing it to adapt its display layer to features better suited to a touchscreen device, which offers a very different user experience from a device that relies on a mouse.
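To make the idea of analysing a useragent string concrete, here is a minimal Python sketch that pulls a few coarse properties out of the example string quoted earlier. It is only an illustration: the helper name parse_useragent and the handful of patterns are my own assumptions, and a real detection service such as 51Degrees relies on a far larger device database than a couple of regular expressions.

```python
import re

# Example useragent string quoted in the text (the Chrome version is
# elided in the source, so it is left empty here as well).
USERAGENT = (
    "Mozilla/5.0 (Linux; Android 4.4.2; MNPH717M Build/KOT49H) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/ "
    "Mobile Safari/537.36"
)

# Hypothetical, deliberately small list of platform tokens to look for.
PLATFORM_PATTERN = re.compile(
    r"(Android|iPhone OS|Windows NT|Mac OS X)[ _]?([\d._]*)"
)


def parse_useragent(useragent: str) -> dict:
    """Extract a few coarse properties from a useragent string."""
    match = PLATFORM_PATTERN.search(useragent)
    return {
        "platform": match.group(1) if match else None,
        # Some platforms write versions with underscores, e.g. 8_0.
        "platform_version": match.group(2).replace("_", ".") if match else None,
        "is_mobile": "Mobile" in useragent,
        "layout_engine": "WebKit" if "AppleWebKit" in useragent else None,
    }


info = parse_useragent(USERAGENT)
print(info)
```

For the example string this reports an Android platform, version 4.4.2, on a mobile device. Properties such as touchscreen support cannot be read from the string itself, which is why a lookup service is needed.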

3 Identify five competency questions for your domain.
  • Which devices are capable of being displayed a responsive website (responsive websites rely heavily on media queries)?
  • Are we able to display ajax-driven content (the content that loads dynamically without reloading the page) on cellphones?
  • Which devices are we able to display personalised content to, based on their location?
  • Which devices are touchscreen-enabled?
  • What proportion of devices cannot support SVG (scalable vector graphics)?
4 Select approximately 25 class names for an initial taxonomy.

In order to create the taxonomy, the following steps were taken:

  • The useragent strings of 50 000 online orders were extracted and saved to an Excel spreadsheet.
  • Each row was imported into a table in a MySQL database.
  • Duplicate rows were removed and the data was cleaned. This reduced the data from 50 000 rows to 2 846 rows.
  • A comparison of the free service offering by WURFL and 51Degrees was done:
    • WURFL offered the detection of up to five device properties for free.
    • 51Degrees offered 55+ free device properties and they have an API.
  • 51Degrees was selected as the device detection service to use, and I signed up with them to get an API key.
  • To avoid running too many device detections over a short period of time and potentially getting flagged for abuse, I set up a cron job which looped through the table rows, querying the API 100 rows per job.
  • With the data 51Degrees returned, I used the device property descriptions to create the taxonomy, and the device property values to create instances (see Resource 1).
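The de-duplication and batching steps above can be sketched in Python as follows. This is a hedged illustration rather than the code I used: the original work was done in MySQL with a cron job, the function names are hypothetical, the input data is a stand-in, and the call to the 51Degrees API is deliberately left out.

```python
from itertools import islice
from typing import Iterable, Iterator, List

BATCH_SIZE = 100  # rows per cron job, as described in the steps


def deduplicate(useragents: Iterable[str]) -> List[str]:
    """Drop blank and duplicate strings while keeping first-seen order."""
    seen = dict.fromkeys(ua.strip() for ua in useragents if ua and ua.strip())
    return list(seen)


def batches(rows: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` rows, one slice per job run."""
    iterator = iter(rows)
    while chunk := list(islice(iterator, size)):
        yield chunk


# Hypothetical stand-in data: 2 846 distinct strings, each appearing twice.
raw = [f"useragent-{i}" for i in range(2846)] * 2
unique = deduplicate(raw)
jobs = list(batches(unique))
print(len(unique), len(jobs), len(jobs[-1]))
```

With 2 846 unique rows and 100 rows per job, the work splits into 29 runs, the last of which holds the remaining 46 rows.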
5 The taxonomy must be at least 3 nodes deep. Create the taxonomy and draw it as a graph.

1 Model concepts based on the taxonomy identified in Part 1.
2 Model approximately 50 relationships, identifying their domain and range.
  • Include 4 OWL subclass restrictions.
  • Include 1 datatype and 1 object property.
  • Include 1 property restriction each on the domain and range.
  • Include 2 Cardinality restrictions (exact, min and max).
  • Include 1 Existential restriction.
  • Include 1 Universal restriction.
  • Include 1 symmetric, 1 transitive, 1 inverse, 1 inverse functional.

1. Include four OWL subclass restrictions:

Numerous subclass restrictions have been defined in the ontology:

  • Class Blackberry is a subClass of Browser
  • Class Blackberry_Browser is a subClass of Blackberry
  • Class Smart_Device is a subClass of DeviceType
  • Class Hardware_Vendor is a subClass of Vendor

<rdf:Description rdf:about="#Blackberry">
   <rdf:type rdf:resource=""/>
   <bd:isCreatedBy rdf:resource="#Blackberry_Limited"/>
   <rdfs:label rdf:datatype="">Blackberry</rdfs:label>
   <rdfs:subClassOf rdf:resource="#Browser"/>
</rdf:Description>

<rdf:Description rdf:about="#Blackberry_Browser">
   <rdf:type rdf:resource=""/>
   <rdfs:label rdf:datatype="">Blackberry Browser</rdfs:label>
   <rdfs:subClassOf rdf:resource="#Blackberry"/>
</rdf:Description>

<rdf:Description rdf:about="#Smart_Device">
   <rdf:type rdf:resource=""/>
   <rdfs:comment rdf:datatype="">Smart devices refer to everyday appliances that are internet-enabled, for eg. a fridge.</rdfs:comment>
   <rdfs:label rdf:datatype="">Device Type - Smart Device</rdfs:label>
   <rdfs:subClassOf rdf:resource="#DeviceType"/>
</rdf:Description>

<rdf:Description rdf:about="#Hardware_Vendor">
   <rdf:type rdf:resource=""/>
   <rdfs:comment rdf:datatype="">This is a company who only manufactures the hardware of a browsing device.</rdfs:comment>
   <rdfs:label rdf:datatype="">Hardware Vendor</rdfs:label>
   <rdfs:subClassOf rdf:resource="#Vendor"/>
</rdf:Description>


2. Include one datatype and one object property:


Datatype property:

Property hasBrowserCapabilitiesDeviceOrientation (a subProperty of hasBrowserCapabilities) has its range set to xsd:boolean.

<rdf:Description rdf:about="#hasBrowserCapabilitiesDeviceOrientation">
   <rdf:type rdf:resource=""/>
   <rdfs:comment rdf:datatype="">This indicates if the browser supports DOM events for device orientation.</rdfs:comment>
   <rdfs:label rdf:datatype="">has Browser Capabilities - Device Orientation</rdfs:label>
   <rdfs:range rdf:resource=""/>
   <rdfs:subPropertyOf rdf:resource="#hasBrowserCapabilities"/>
</rdf:Description>

Object property:

Property isCreatedBy has its range set to class Vendor.

<rdf:Description rdf:about="#isCreatedBy">
   <rdf:type rdf:resource=""/>
   <rdfs:label rdf:datatype="">is created by</rdfs:label>
   <rdfs:range rdf:resource="#Vendor"/>
</rdf:Description>


3. Include one property restriction each on the domain and range:

Restriction on the domain:

The property hasDeviceType has its domain set to Device. This means the property can be reused, but only within the Class Device.

Restriction on the range:

The same property has its range set to DeviceType. This means the property can be reused, but only instances of the Class DeviceType can be selected as its value.

<rdf:Description rdf:about="#hasDeviceType">
   <rdf:type rdf:resource=""/>
   <rdfs:comment rdf:datatype="">This shows the list of device types associated with a device instance.</rdfs:comment>
   <rdfs:domain rdf:resource="#Device"/>
   <rdfs:label rdf:datatype="">has device type</rdfs:label>
   <rdfs:range rdf:resource="#DeviceType"/>
</rdf:Description>


4. Include two cardinality restrictions (exact, min and max):


Exact:

An exact cardinality restriction has been placed on Class Vendor for the property hasHeadQuartersIn. A Vendor has its headquarters in exactly one country.

<rdf:Description rdf:about="#Vendor">
   <rdf:type rdf:resource=""/>
   <rdfs:comment rdf:datatype="">This is a company who manufactures the hardware and creates the software of a browsing device.</rdfs:comment>
   <rdfs:label rdf:datatype="">Vendor</rdfs:label>
   <owl:equivalentClass rdf:nodeID="A0"/>
</rdf:Description>
<rdf:Description rdf:nodeID="A0">
   <rdf:type rdf:resource=""/>
   <owl:onProperty rdf:resource="#hasHeadQuartersIn"/>
   <owl:cardinality rdf:datatype="">1</owl:cardinality>
</rdf:Description>

Min and max:

Minimum and maximum cardinality restrictions have been placed on Class Vendor for the property hasOperationsIn. A Vendor must have operations in at least one continent and in at most seven continents.

<rdf:Description rdf:about="#Vendor">
   <rdf:type rdf:resource=""/>
   <rdfs:comment rdf:datatype="">This is a company who manufactures the hardware and creates the software of a browsing device.</rdfs:comment>
   <rdfs:label rdf:datatype="">Vendor</rdfs:label>
   <owl:equivalentClass rdf:nodeID="A2"/>
   <owl:equivalentClass rdf:nodeID="A0"/>
</rdf:Description>
<rdf:Description rdf:nodeID="A0">
   <rdf:type rdf:resource=""/>
   <owl:onProperty rdf:resource="#hasOperationsIn"/>
   <owl:minCardinality rdf:datatype="">1</owl:minCardinality>
</rdf:Description>
<rdf:Description rdf:nodeID="A2">
   <rdf:type rdf:resource=""/>
   <owl:onProperty rdf:resource="#hasOperationsIn"/>
   <owl:maxCardinality rdf:datatype="">7</owl:maxCardinality>
</rdf:Description>
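
As a rough illustration of what the minimum and maximum bounds on hasOperationsIn express, the following Python sketch counts the continents recorded for each vendor and compares the count against the bounds. The vendor names and data are hypothetical, and the check is a closed-world validation of my own devising: an OWL reasoner works under the open-world assumption and would not treat missing data as a violation.

```python
from typing import Dict, Set

MIN_CONTINENTS = 1  # mirrors owl:minCardinality on hasOperationsIn
MAX_CONTINENTS = 7  # mirrors owl:maxCardinality on hasOperationsIn

# Hypothetical instance data: vendor -> continents it has operations in.
has_operations_in: Dict[str, Set[str]] = {
    "Vendor_A": {"Africa", "Europe"},
    "Vendor_B": set(),
}


def within_bounds(continents: Set[str]) -> bool:
    """Closed-world check that the number of continents lies in [1, 7]."""
    return MIN_CONTINENTS <= len(continents) <= MAX_CONTINENTS


report = {vendor: within_bounds(c) for vendor, c in has_operations_in.items()}
print(report)
```

Here Vendor_A passes and Vendor_B, with no recorded continents, is flagged. A reasoner would instead conclude that Vendor_B has operations in some continent that simply has not been stated.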


5. Include one existential restriction:

Every instance of Class Device must be related, through the property hasBrowser, to at least one instance of Class Browser.

<rdf:Description rdf:nodeID="A0">
  <rdf:type rdf:resource=""/>
  <owl:onProperty rdf:resource="#hasBrowser"/>
  <owl:someValuesFrom rdf:resource="#Browser"/>
</rdf:Description>
<rdf:Description rdf:about="#Device">
  <rdf:type rdf:resource=""/>
  <rdfs:comment rdf:datatype="">This shows the browser capabilities for a particular configuration, grouped primarily according to browser and version (for eg. Chrome - V35, V36, and V37), as well as device type and platform.  If a browser and its associated version do not share the same browser capabilities when run on different platforms for example, then a separate instance is recorded.  The label (rdfs:label) is an identifier only.</rdfs:comment>
  <rdfs:label rdf:datatype="">Device</rdfs:label>
  <rdfs:subClassOf rdf:resource=""/>
  <owl:someValuesFrom rdf:nodeID="A0"/>
</rdf:Description>


6. Include one universal restriction:

The Class Telephonic_Device can only include instances from Class Mobile_Device which hasTelephoneCapabilities.

<owl:Class rdf:ID="Telephonic_Device">
  <rdfs:label rdf:datatype="">Device Type - Telephonic Device</rdfs:label>
  <rdfs:comment rdf:datatype="">This class lists mobile devices which have telephonic capabilities.</rdfs:comment>
  <rdfs:subClassOf rdf:resource="#DeviceType"/>
  <owl:allValuesFrom rdf:resource="#Mobile_Device"/>
  <owl:onProperty rdf:resource="#hasTelephoneCapabilities"/>
</owl:Class>


7. Include one symmetric, one transitive, one inverse and one inverse functional:


Symmetric:

The property isConnectedTo is set as a symmetric property.

The isConnectedTo property is then set from Vendor instance Mozilla_Foundation to Vendor instance Mozilla_Corporation. From this, a reasoner infers that Mozilla_Corporation isConnectedTo Mozilla_Foundation.

<rdf:Description rdf:about="#isConnectedTo">
  <rdf:type rdf:resource=""/>
  <rdfs:label rdf:datatype="">is connected to</rdfs:label>
</rdf:Description>


Transitive:

owl:imports is transitive. The instance Logo_Google imports its official logo from Wikipedia.

<rdf:Description rdf:about="#Logo_Google">
  <rdf:type rdf:resource="#Logo"/>
  <rdfs:label rdf:datatype="">Logo - Google</rdfs:label>
  <owl:imports rdf:resource=""/>
</rdf:Description>


Inverse:

The following property has been set: Class Browser isCreatedBy Class Vendor.

The inverse property, creates, states that Class Vendor creates Class Browser.

<rdf:Description rdf:about="#creates">
  <rdf:type rdf:resource=""/>
  <rdfs:label rdf:datatype="">creates</rdfs:label>
  <owl:inverseOf rdf:resource="#isCreatedBy"/>
</rdf:Description>
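
The inferences that the symmetric and inverse properties license can be sketched in a few lines of Python. This is a simplified illustration over plain tuples, not how a reasoner is implemented: the infer function is a hypothetical helper, and the prefixes and namespaces of the ontology are dropped for brevity.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

SYMMETRIC = {"isConnectedTo"}
INVERSE_OF = {"isCreatedBy": "creates", "creates": "isCreatedBy"}


def infer(triples: Set[Triple]) -> Set[Triple]:
    """Return the asserted triples plus those entailed by symmetry and inverseOf."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate in SYMMETRIC:
            # A symmetric property also holds in the opposite direction.
            inferred.add((obj, predicate, subject))
        if predicate in INVERSE_OF:
            # An inverse property holds with subject and object swapped.
            inferred.add((obj, INVERSE_OF[predicate], subject))
    return inferred


asserted = {
    ("Mozilla_Foundation", "isConnectedTo", "Mozilla_Corporation"),
    ("Blackberry", "isCreatedBy", "Blackberry_Limited"),
}
closure = infer(asserted)
for triple in sorted(closure):
    print(triple)
```

From the two asserted statements, the sketch derives that Mozilla_Corporation isConnectedTo Mozilla_Foundation and that Blackberry_Limited creates Blackberry.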

Inverse functional:

The instance Australia, for the Class Country, is also a Continent.

<rdf:Description rdf:about="#isContinent">
  <rdfs:domain rdf:resource="#Country"/>
  <rdfs:label rdf:datatype="">is continent</rdfs:label>
  <rdfs:range rdf:resource="#Continent"/>
  <rdf:type rdf:resource=""/>
</rdf:Description>

<rdf:Description rdf:about="#isCountry">
  <rdfs:domain rdf:resource="#Continent"/>
  <rdfs:label rdf:datatype="">is country</rdfs:label>
  <rdf:type rdf:resource=""/>
  <rdfs:range rdf:resource="#Country"/>
</rdf:Description>

4 Create individuals of all the concepts modelled using the properties defined as above, ensuring that the number of concepts remains between 10 and 20.

1 Choose two competency questions from Part 1.
2 For each of those questions, write a SPARQL query that answers the competency questions. Use query features like FILTER, OPTIONAL, LIMIT, OFFSET, UNION etc in your queries.

Question 1

Which devices are capable of being displayed a responsive website (responsive websites rely heavily on media queries)?

For this query, the properties hasBrowserCapabilitiesCSSMediaQueries and hasBrowserCapabilitiesHTMLHtml5 are required to be true.

PREFIX xsd: <>
PREFIX bd: <>

SELECT ?deviceType ?browser ?platform
WHERE {
	# One row per device configuration, with its browser, type and platform
	?device bd:hasBrowser ?browser ;
		bd:hasDeviceType ?deviceType ;
		bd:hasPlatform ?platform ;
		bd:hasBrowserCapabilitiesCSSMediaQueries ?mediaQueries ;
		bd:hasBrowserCapabilitiesHTMLHtml5 ?html5 .
	# Keep only configurations that support both media queries and HTML5
	FILTER (?mediaQueries = true)
	FILTER (?html5 = true)
}
ORDER BY ?deviceType
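
To show what the query for Question 1 selects without needing a triple store, here is a small Python sketch that applies the same two conditions and the same ordering to in-memory records. The records are hypothetical stand-ins for Device instances, and the dictionary keys simply reuse the query's variable names.

```python
# Hypothetical records standing in for Device instances in the ontology.
devices = [
    {"deviceType": "Tablet", "browser": "Browser_A", "platform": "Platform_X",
     "mediaQueries": True, "html5": True},
    {"deviceType": "Cellphone", "browser": "Browser_B", "platform": "Platform_Y",
     "mediaQueries": False, "html5": False},
    {"deviceType": "Cellphone", "browser": "Browser_C", "platform": "Platform_Z",
     "mediaQueries": True, "html5": True},
]

# Equivalent of the two FILTER clauses followed by ORDER BY ?deviceType.
responsive = sorted(
    (d for d in devices if d["mediaQueries"] and d["html5"]),
    key=lambda d: d["deviceType"],
)

# Equivalent of the SELECT projection.
rows = [(d["deviceType"], d["browser"], d["platform"]) for d in responsive]
for row in rows:
    print(row)
```

Only the two configurations that support both media queries and HTML5 are returned, with the cellphone listed before the tablet.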


Question 2

Are we able to display ajax-driven content (the content that loads dynamically without reloading the page) on cellphones?

Here, the JSON and FormData values are required to be true.

PREFIX xsd: <>
PREFIX bd: <>

ASK {
	?device bd:hasDeviceType ?deviceType ;
		bd:hasBrowserCapabilitiesJSON ?json ;
		bd:hasBrowserCapabilitiesPostMessage ?formData .
	# Restrict to cellphones, then require both capabilities
	FILTER (REGEX(?deviceType, "^cellphone", "i"))
	FILTER (?json = true)
	FILTER (?formData = true)
}
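
Because Question 2 is a yes/no question, its logic reduces to asking whether at least one matching record exists. The Python sketch below expresses that over hypothetical in-memory records. The function name is an assumption, and the keys json and postMessage mirror the two capability properties used in the query.

```python
import re
from typing import Dict, Iterable

# Same pattern as the REGEX filter: starts with "cellphone", case-insensitive.
CELLPHONE = re.compile(r"^cellphone", re.IGNORECASE)

# Hypothetical records standing in for Device instances in the ontology.
devices = [
    {"deviceType": "Cellphone", "json": True, "postMessage": True},
    {"deviceType": "Tablet", "json": True, "postMessage": False},
]


def ajax_capable_cellphone_exists(records: Iterable[Dict]) -> bool:
    """Return True if any cellphone record has both capabilities set."""
    return any(
        bool(CELLPHONE.search(r["deviceType"])) and r["json"] and r["postMessage"]
        for r in records
    )


answer = ajax_capable_cellphone_exists(devices)
print(answer)
```

With the sample records the answer is True, since one cellphone supports both capabilities. Removing that record would turn the answer to False.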

1 Excel document of the device properties
2 The OWL ontology, created in TopBraid Composer
3 Screenshot - TopBraid Composer - Class Hierarchy
4 Screenshot - TopBraid Composer - Class Overview

Author: Frances Gillis-Webber. Created: 9 August 2015