Avineon Tensing

Using FME + AI for ArcGIS Feature Service metadata creation

Metadata serves as the unsung hero, quietly ensuring that spatial data can be discovered, understood and used as intended. Despite its importance, metadata creation remains one of the most neglected aspects of geospatial data management. Some fields can be populated automatically, but the most valuable ones often require tedious manual effort. It’s common to see metadata that is partially completed, inconsistent and, in some cases, entirely missing key information. This makes finding data a challenge and generally leads to headaches.

At the FME UC (The Peak of Data and AI) in Seattle earlier in the year, Grace Cai, Innovation Lead at Shell, presented on AI Agents & Metadata. She is also re-running the presentation in an upcoming Safe Software webinar.

During the session, Grace explored the trials and tribulations of using FME and AI to generate high-quality synthetic metadata, sharing many of the prompts used to get the best out of the AI. Grace called out the need for crowdsourced evaluations and prompt suggestions. We took inspiration from the talk and are happy to share a workflow for metadata attribution within ArcGIS Online (AGOL) and Portal, all using the powerful FME and AI combo.

By combining FME's powerful workflow automation with the flexibility to use the latest LLMs, we can transform what was once a burdensome manual process into a more streamlined, consistent and value-adding component of a spatial data management strategy.

The Metadata Challenge: A Persistent Headache

Before diving into solutions, let's acknowledge the scope of the problem. Missing or incomplete metadata creates a cascade of issues for organisations:

  • Staff spend countless hours searching for data that lacks adequate descriptions. Without proper tagging and descriptions, even the most valuable datasets remain hidden.
  • Valuable institutional knowledge about datasets disappears when metadata is absent.
  • Compliance becomes a problem as a backlog of non-compliant metadata builds up.
  • Datasets without context lose significant analytical potential and, to borrow the bandwagon phrase, the data isn’t ‘AI Ready’.

The Solution: An Intelligent, Automated Workflow

Using FME and AI provides a solution that can automate the entire pipeline from data ingestion to a fully documented ArcGIS Online feature service.

Here is a workflow summary, followed by the breakdown using our Red Squirrel Sightings dataset:

1. Data Ingestion & Service Creation

In this scenario, we are migrating an old MapInfo TAB file of Red Squirrel Sightings across Scotland to ArcGIS Online. This could be any geospatial data you want to publish to your ArcGIS environment – FME supports over 450 data formats. Using FME, we dynamically read in the dataset while also generating the service's name and title, establishing consistent naming conventions, and then publishing the Feature Service.
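As a rough illustration of what a consistent naming convention could look like, the sketch below derives a service name and title from the source file name. The function name `service_names`, the `env` prefix and the sample file path are hypothetical and not taken from the workspace; in FME this logic would typically sit in string transformers rather than in Python.

```python
import re
from pathlib import Path


def service_names(source_path: str, prefix: str = "env") -> dict:
    """Derive a service name and a display title from a source file name.

    Hypothetical convention: lower-case, underscore-separated name with a
    prefix, and a title-cased display title.
    """
    stem = Path(source_path).stem                # file name without extension
    words = re.findall(r"[A-Za-z0-9]+", stem)    # split on spaces, hyphens, underscores
    name = "_".join([prefix] + [w.lower() for w in words])
    title = " ".join(w.capitalize() for w in words)
    return {"name": name, "title": title}


print(service_names("data/Red Squirrel-sightings_2024.TAB"))
```

With the sample path above, the name comes out as `env_red_squirrel_sightings_2024` and the title as `Red Squirrel Sightings 2024`.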

2. Spatial Context and Schema Analysis

We don't want to simply transfer the data. In the workflow, we first gain an understanding of the dataset's spatial context and its underlying data structure.

  • FME calculates the bounding box of the input data and reprojects it to WGS84, which is required for ArcGIS Online item details.
  • After the feature service is created, the FME workflow reads the data back and, with a bit of SQL, generates a detailed summary of the schema.
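The two steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the coordinates are assumed to be already reprojected to WGS84 longitude/latitude (FME handles the reprojection in the workflow), the schema summary is built in Python rather than SQL, and the function names and sample records are hypothetical.

```python
def wgs84_extent(points):
    """Return 'xmin,ymin,xmax,ymax' for (lon, lat) pairs assumed to be in WGS84."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return f"{min(lons)},{min(lats)},{max(lons)},{max(lats)}"


def schema_summary(records):
    """Summarise each attribute: value types seen and how many records are populated."""
    lines = []
    fields = sorted({key for rec in records for key in rec})
    for field in fields:
        present = [rec[field] for rec in records if rec.get(field) is not None]
        types = sorted({type(v).__name__ for v in present})
        type_text = "/".join(types) or "unknown"
        lines.append(f"{field}: types={type_text}, populated={len(present)}/{len(records)}")
    return "\n".join(lines)


# Hypothetical sample data standing in for the Red Squirrel Sightings features.
points = [(-4.2, 57.1), (-3.5, 56.4)]
records = [
    {"species": "Red Squirrel", "count": 2, "observer": "A. Smith"},
    {"species": "Red Squirrel", "count": 1, "observer": None},
]

print(wgs84_extent(points))
print(schema_summary(records))
```

The text summary produced by `schema_summary` is the kind of compact description that can later be embedded in a prompt.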

3. AI-Powered Metadata Generation

Here's where AI comes into the process as we build some prompts. The quality of the AI-generated metadata is determined by our prompt. By embedding the schema of our dataset into the prompt, we can utilise AI to produce metadata that is accurate and contextually aware.

The workspace makes two separate API calls to Google's Gemini.

  1. The first call, using a prompt that embeds the dataset schema, requests a summary, detailed description, and relevant tags for the feature service.
  2. The second call requests individual descriptions for each field in the dataset, providing granular metadata at the attribute level.

Prompt Code Block

Structured Output Code Block

Because FME automates the workflow, a structured output is just as important as a detailed prompt. Providing the AI with a schema for the structured output ensures we get consistent JSON responses, which is what makes the process automatable.
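To make the pairing of prompt and structured output more concrete, here is a minimal sketch of a request body for the first call. It is not the prompt or schema used in the workspace: the prompt wording, the function names and the three keys (`summary`, `description`, `tags`) are assumptions. The body shape is modelled on the Gemini REST `generateContent` request, where `generationConfig` can carry `responseMimeType` and `responseSchema`; check the current API reference before relying on it.

```python
import json

REQUIRED_KEYS = {"summary", "description", "tags"}


def build_request(schema_text: str) -> dict:
    """Build a request body that embeds the dataset schema in the prompt."""
    prompt = (
        "You are a geospatial metadata author. Using only the schema below, "
        "write a one-sentence summary, a detailed description and 5 to 10 tags "
        "for an ArcGIS feature service.\n\nSchema:\n" + schema_text
    )
    response_schema = {
        "type": "OBJECT",
        "properties": {
            "summary": {"type": "STRING"},
            "description": {"type": "STRING"},
            "tags": {"type": "ARRAY", "items": {"type": "STRING"}},
        },
        "required": sorted(REQUIRED_KEYS),
    }
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "responseMimeType": "application/json",
            "responseSchema": response_schema,
        },
    }


def parse_metadata(response_text: str) -> dict:
    """Parse the model's JSON text and fail loudly if a key is missing."""
    data = json.loads(response_text)
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    return data


body = build_request("species: types=str, populated=2/2")
print(json.dumps(body["generationConfig"], indent=2))
```

Validating the response on the way back in, as `parse_metadata` does, means a malformed reply stops the workflow instead of writing incomplete metadata to the service.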

4. ArcGIS Online Service Enhancement

With AI-generated metadata in hand, the workflow enriches the ArcGIS Online service:

  • Item Details Update: The ArcGISOnlineConnector is used to update the newly created feature service's item page with the AI-generated summary, description, and tags.
  • Field Descriptions: Additional API calls to the ArcGIS REST API update metadata for each field with the AI-generated descriptions.
  • Sharing Configuration: The workspace shares the new feature service with specific organisational groups in ArcGIS Online, ensuring appropriate access.

The Squirrel Feature Service with AI-populated description and tags

Each attribute field within the Feature Dataset has been populated with a Description to support search
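For readers who want to see roughly what these updates look like outside FME, the sketch below builds the form parameters for two ArcGIS REST operations: the item `update` operation (which accepts `snippet`, `description`, `tags` and `extent`) and the feature layer `updateDefinition` operation. It only builds the parameters and sends nothing. The function names are hypothetical, and the exact shape of the per-field `description` value is an assumption that should be checked against the ArcGIS REST API documentation for your environment.

```python
import json


def item_update_params(metadata: dict, extent: str, token: str) -> dict:
    """Form parameters for the item 'update' operation (item details page)."""
    return {
        "f": "json",
        "token": token,
        "snippet": metadata["summary"],          # short summary on the item page
        "description": metadata["description"],
        "tags": ",".join(metadata["tags"]),      # comma-separated tag list
        "extent": extent,                        # 'xmin,ymin,xmax,ymax' in WGS84
    }


def field_definition_params(field_descriptions: dict, token: str) -> dict:
    """Form parameters for 'updateDefinition' on a feature layer.

    Assumption: each field accepts a plain-text 'description'; some
    environments may expect a different structure.
    """
    fields = [{"name": n, "description": d} for n, d in field_descriptions.items()]
    return {
        "f": "json",
        "token": token,
        "updateDefinition": json.dumps({"fields": fields}),
    }


# Hypothetical values for illustration only.
metadata = {
    "summary": "Red squirrel sightings across Scotland.",
    "description": "Point records of red squirrel sightings.",
    "tags": ["red squirrel", "Scotland", "wildlife"],
}
print(item_update_params(metadata, "-4.2,56.4,-3.5,57.1", "<token>"))
print(field_definition_params({"species": "Species observed."}, "<token>"))
```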

Considerations and Best Practices

While our FME workflow provides a suitable starting point, successful implementation requires attention to several key factors:

  • The AI is only as good as the data it analyses. Before adopting a process like this you would want to ensure that your data is complete, clean and well-structured.
  • A generic prompt will give you generic results. Prompts perform better when they are customised to include relevant terminology, domain understanding and context, to recognise important patterns (e.g. ecological survey data vs. utility network data), and to align with any internally used acronyms.
  • AI is not infallible and can "hallucinate" or misinterpret context. A robust quality control process is essential to maintain trust. This also speaks to Grace’s call to have a feedback loop and to crowdsource evaluations.
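One inexpensive first layer of quality control is an automated sanity check that flags suspicious output for human review. The sketch below is an illustration rather than part of the workspace: the function name, the `min_tags` threshold and the checks themselves are assumptions, and they complement human evaluation rather than replace it.

```python
def review_flags(metadata: dict, field_descriptions: dict,
                 schema_fields: list, min_tags: int = 3) -> list:
    """Return human-readable reasons why AI-generated metadata needs review."""
    flags = []
    if not metadata.get("summary", "").strip():
        flags.append("summary is empty")
    if len(metadata.get("tags", [])) < min_tags:
        flags.append(f"fewer than {min_tags} tags")
    # Every real field should have a description.
    for name in schema_fields:
        if not field_descriptions.get(name, "").strip():
            flags.append(f"no description for field '{name}'")
    # A description for a field that does not exist suggests a hallucination.
    known = set(schema_fields)
    for name in field_descriptions:
        if name not in known:
            flags.append(f"description for unknown field '{name}'")
    return flags


flags = review_flags(
    {"summary": "Red squirrel sightings.", "tags": ["squirrel"]},
    {"species": "Species observed.", "habitat": "Habitat type."},
    ["species", "count"],
)
for flag in flags:
    print(flag)
```

An empty list means the record passed these basic checks; anything else can be routed to a reviewer before the metadata is published.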

Exciting times for Geospatial Metadata

Automating metadata creation with FME and AI is now technically viable, and it changes how we approach metadata management. By eliminating most (*importantly, not all*) of the manual effort in creating and maintaining metadata, we not only save valuable time but also enhance the overall value and utility of spatial data assets.

Looking to the future

Grace is already way ahead on this one, suggesting that in time LLMs can be used as judges within an Agentic AI workflow to ensure metadata is evaluated to a high standard. Check out the upcoming Safe Software webinar for more insights and inspiration!