What has the Semantic Web ever done for BIM?

If you don’t know what the Semantic Web is, Wikipedia has an introduction: https://en.wikipedia.org/wiki/Semantic_Web

“The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries”

The Semantic Web was discussed in the late 1990s and popularised in the 2000s. Standards were produced and revised, resulting in schemas, and some industries now use it extensively in their web catalogues.

Why is this important to the world of AEC (Architecture, Engineering & Construction)?

The growth of applications that store data and make it searchable creates the opportunity to sort and filter output so that it meets a specific requirement. This is the premise for how data in BIM can flow to the AIM (Asset Information Model), provided the data exists. It relies on getting the data that the AIM requires into BIM in the first place.



If we apply the typical digital adoption graph, BIM in AEC is probably in the Late Majority segment. Digital product data, however, is more likely in the Innovators to Early Adopters segments.

I say this because we see a lot of data being handcrafted from spreadsheets, or cut and pasted from PDFs and web pages. Those trying to do this entirely in the BIM model with components have struggled to obtain the required data. The workflow, from design through to construction, puts the contractual responsibility on the contractors, not all of whom will be able to populate the BIM models. The Asset Data deliverable was always expected to be a combination of data from the model and other sources.

What was not appreciated was the volume of data that comes from the “other sources” and how difficult it would be to provide manually in the data exchange format. This is highlighted by the checking and validation programs used by information managers. However, the evidence is only anecdotal, and there are no controlled surveys of the number of iterations needed before a pass is achieved.

Storing and reusing product data across projects provides efficiencies, but there will always be a need to add new products. Files and folder structures become increasingly difficult to manage. Using BIM modelling software for this requires developing centralised libraries and training operatives to manage the quality and integrity of the data.

Many manufacturers who invested in the development of models, in the expectation that their products would be specified, have not seen any ROI (Return On Investment). They are now quite sceptical and have halted or curtailed further investment. The BIM object library businesses will start to feel the squeeze very rapidly unless they can offer a simple way to leverage the data they now hold by other means.

If you wanted to manage your own product data library, the nearest off-the-shelf options are ecommerce applications, which would require customisation to provide exports in the required format. They are designed to integrate with selling platforms like Shopify, Amazon and eBay.

There are few product libraries designed specifically for AEC and facility management, and even fewer that integrate into a network where product data can be replicated, extended and tested against requirements. The ability to test against requirements is a critical feature for managing the resources working on the data, because it avoids costly over-delivery. (https://products.activeplan.com is one of these applications.)

Web automation methods can build data sets that meet most requirements from existing web catalogues by exploiting the structure of their pages. This would be easier if the pages used recognised semantics in a common structure.

This is where the adoption of the Semantic Web is critical: it provides an environment that serves both the suppliers and manufacturers of products and the data needs of end recipients. It is low cost and low risk for everyone, and the best option for the industry.

This is how it works.

The Semantic Web, as proposed by Tim Berners-Lee, uses attributes that can be added to any HTML tag. Attributes are a standard feature of eXtensible Markup Language (XML).

Semantic tagging comes in a few flavours: microdata, RDF, RDFa (Resource Description Framework in Attributes) and RDFa Lite. RDFa was first defined in 2008.

HTML 5 is the latest incarnation of the W3C HTML specification and includes some standard semantic tags, such as header, nav, article, section, aside and footer. These tags can all be styled but, most importantly, they can be read by machines. All web designers can use these tags to help search engines index their pages, which is a core part of SEO (Search Engine Optimisation).


When I created our DEAUS robot and let it loose on manufacturers’ web catalogues and merchants’ sales web sites, I was astounded at how easy it was. Some of the interpretation used the semantics above, but mostly I had to rely on styling class names.

Just to be clear, using the class attribute of a tag for styling does not improve SEO.


The use of semantics relies on an industry standard. https://schema.org, founded by Google, Microsoft, Yahoo and Yandex, provides sets of schemas for a variety of information sets, including Product and Organisation (spelled Organization in the schema itself). The same schemas are used in both microdata and RDFa syntax. Web designers who implement these semantic schemas should achieve better search rankings.
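
To make this concrete, here is a minimal sketch of how a robot might read schema.org microdata from a product page using only the Python standard library. The page fragment, product name, model and SKU are all made up for illustration, and the parser is deliberately simple: it assumes flat, un-nested properties. It is not how DEAUS works.

```python
from html.parser import HTMLParser

# Hypothetical product page fragment marked up with schema.org microdata.
SAMPLE_PAGE = """
<div itemscope itemtype="https://schema.org/Product">
  <h1 itemprop="name">Example Door Closer</h1>
  <p itemprop="description">Overhead door closer for fire doors.</p>
  <span itemprop="model">DC-100</span>
  <meta itemprop="sku" content="SKU-0001">
</div>
"""


class MicrodataParser(HTMLParser):
    """Collect itemprop name/value pairs (flat properties only, no nesting)."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # Value carried in an attribute, e.g. <meta itemprop content>.
            self.properties[prop] = attrs["content"]
        else:
            # Value is the text inside the element.
            self._current = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Any closing tag ends the current property in this simple sketch.
        self._current = None


def extract_microdata(html):
    parser = MicrodataParser()
    parser.feed(html)
    return {name: value.strip() for name, value in parser.properties.items()}


product = extract_microdata(SAMPLE_PAGE)
print(product)
```

The point is that the robot asks for `name` or `model` by their schema.org property names, so the same code would work on any catalogue that uses the schema, with no per-site training on class names.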

Here is a link for leveraging the Google search engine.


Most web catalogues display product pages based on a common layout template. Each manufacturer’s template is different, and I had to train my robot for each web catalogue.

Every manufacturer wants their products to be discovered by search engines. Most, but not all, put a file named robots.txt at the root of their site to help search engine web crawlers discover the appropriate pages.

Below is an example of its content.
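
As a hedged illustration, the sketch below embeds a made-up robots.txt and reads it with Python's standard `urllib.robotparser` module. The domain www.example.com, the paths and the robot name `my-robot` are placeholders, not taken from any real catalogue.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; www.example.com is a placeholder domain.
ROBOTS_TXT = """\
User-agent: *
Disallow: /basket/
Disallow: /account/
Allow: /products/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether a robot called "my-robot" (a made-up name) may read each page.
can_read_product = parser.can_fetch(
    "my-robot", "https://www.example.com/products/door-closer"
)
can_read_basket = parser.can_fetch("my-robot", "https://www.example.com/basket/")

# Sitemap references listed in the file (site_maps() needs Python 3.8+).
sitemaps = parser.site_maps()

print(can_read_product, can_read_basket, sitemaps)
```

A well-behaved robot checks `can_fetch` before requesting a page and follows the `Sitemap` entry to find the pages the site owner wants read.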


The usual practice is for robots.txt to reference another file, such as sitemap.xml, which can look like this example.

The sitemap.xml file contains the URLs of all the pages the site administrator wants to be ‘scraped’ by the robot. Web page scraping has been around for a long time, initially as a way to cache pages locally when an internet connection was expensive. Search engines use this method: they cache the page and then process its content, such as the metadata found in the top-level head tag.
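
A minimal sketch of reading such a file, assuming it follows the standard sitemaps.org format, could look like this. The two product URLs and the date are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap.xml content in the sitemaps.org 0.9 format.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/products/door-closer</loc>
    <lastmod>2020-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/products/door-hinge</loc>
  </url>
</urlset>
"""

# The sitemap elements live in this XML namespace.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_page_urls(sitemap_xml):
    """Return the page URLs listed in a sitemap, in document order."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


urls = list_page_urls(SITEMAP_XML)
for url in urls:
    print(url)
```

Each URL returned is a page the robot can fetch and then process for semantic tags.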

The development of web services and APIs has provided neat ways of reading just the data, but implementing them requires companies to invest in web sites that are data driven. That may seem complicated to most marketing departments, who judge design work by how it looks to humans, not machines. There are free online checkers for valid HTML, like https://validator.w3.org, and subscription SEO checkers that will check the semantic tagging and microdata.

For some industries the Semantic Web has been the answer: pages of information, neatly laid out in sections of narrative, are peppered with data tagging. It is hidden from the human but digestible by the machine, the best of both worlds. Google provides a free structured data testing tool, https://search.google.com/structured-data/testing-tool, which takes a URL as input and is a good way to peek under the hood with a more diagnostic view. I’ve tried it out on BIM libraries, product portals, builders’ merchants and some product web catalogues. Judge for yourself.

Companies whose web catalogues already use standard templates for product pages can quickly scale up to take advantage of the Semantic Web, using existing standards.

Q: Why has this not happened in the BIM world where product data is being asked for in deliverables?

A: There is no financial return from providing data after a purchase.

Two elements are needed:

  1. A schema, expressed in the Web Ontology Language (OWL), a family of knowledge representation languages. (It already exists.)
  2. Mass industry adoption (the reason has to be sales driven)

buildingSMART has developed the ontology ifcOWL.


COBieOWL has also been documented, but you won’t find it on the public web.

The industry could wait for its own RDFa vocabulary, but I would argue that the majority of attributes already exist in the existing standards. If adoption is to be sales driven, it does not make sense to develop a competing standard that returns no value for manufacturers’ sales, unless it is government mandated. The big question here is: why would you reinvent the wheel? To make it more round!

Using six of the COBie tables (COBie.Type, COBie.Document, COBie.Attribute, COBie.Contact, COBie.Job and COBie.Issue) and COBieOWL, we can create a mapping to the existing schema standards for Organisation, Product and Product Model. Demonstrated below is my interpretation of the primary COBie.Type table.
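
To give a feel for what such a mapping involves, here is a hypothetical sketch in Python that turns one COBie.Type row into an RDFa Lite fragment typed as a schema.org Product. The column-to-property pairs are illustrative assumptions rather than the interpretation referred to above, the sample row is invented, and the Manufacturer column is assumed to hold a contact email.

```python
from html import escape

# Hypothetical mapping from COBie.Type column names to schema.org Product
# properties. Illustration only; a real mapping needs review per column.
COBIE_TYPE_TO_SCHEMA = {
    "Name": "name",
    "Description": "description",
    "Category": "category",
    "ModelNumber": "model",
    "Color": "color",
    "Material": "material",
}


def cobie_type_to_rdfa(row):
    """Render one COBie.Type row as an RDFa Lite fragment (schema.org Product)."""
    lines = ['<div vocab="https://schema.org/" typeof="Product">']
    for column, prop in COBIE_TYPE_TO_SCHEMA.items():
        value = row.get(column)
        if value:  # skip empty or missing cells
            lines.append(f'  <span property="{prop}">{escape(str(value))}</span>')
    manufacturer = row.get("Manufacturer")
    if manufacturer:
        # Assumption: the Manufacturer cell holds a contact email address.
        lines.append('  <div property="manufacturer" typeof="Organization">')
        lines.append(f'    <span property="email">{escape(manufacturer)}</span>')
        lines.append("  </div>")
    lines.append("</div>")
    return "\n".join(lines)


# Invented sample row, as it might be read from a COBie spreadsheet.
sample_row = {
    "Name": "Door Closer DC-100",
    "Description": "",
    "Category": "Door closers",
    "ModelNumber": "DC-100",
    "Manufacturer": "sales@example.com",
}

fragment = cobie_type_to_rdfa(sample_row)
print(fragment)
```

Because the output uses the existing schema.org vocabulary, the same fragment that feeds an asset data deliverable is also readable by search engines, which is where the sales-driven incentive comes from.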


“What has the semantic web ever done for BIM?”

“At the moment, not a lot but it could turbo charge filling the data gap at really low cost”
