What is Swoogle?
Swoogle started as a research project of the ebiquity research group at the University of Maryland, Baltimore County (UMBC). It is a search engine for Semantic Web ontologies, documents, terms, and data published on the Web. It is a crawler-based indexing and retrieval system for the Semantic Web: it crawls the Web to discover documents written in RDF and OWL, known as Semantic Web documents (SWDs), and maintains an online repository of them. It provides services to human users through a browser interface and to software agents through RESTful web services.
Objective of Swoogle
- More and more SWDs, both ontologies and instance data, are physically distributed all over the Web.
- A retrieval system is needed that organizes these documents in a systematic way.
- Both humans and software agents should be able to search and query this repository easily.
Why use Swoogle?
- Avoid creating new ontologies when suitable ones already exist.
- Reuse existing ontologies.
- Search Semantic Web ontologies.
- Search Semantic Web instance data.
- Search Semantic Web terms, i.e., URIs that have been defined as classes and properties.
- Provide metadata of Semantic Web documents and support browsing the Semantic Web.
- Archive different versions of Semantic Web documents.
What does Swoogle search?
- Find out whether suitable ontologies matching the user's needs already exist in the domain of interest.
- The user enters a specific term.
- Swoogle replies with existing ontologies that also use the term entered.
- The user follows a link to see whether the returned ontology satisfies the need.
- Query SWDs with constraints on the classes and properties they use.
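The search steps above can be pictured as a lookup in an inverted index from terms to the ontologies that use them. The following is a minimal sketch, not Swoogle's actual implementation: the sample ontology URLs, the `ONTOLOGIES` data, and the function names `build_term_index` and `find_ontologies` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: ontology URL -> terms (class and property names) it uses.
ONTOLOGIES = {
    "http://example.org/onto/people.owl": {"Person", "Organization", "name"},
    "http://example.org/onto/camera.owl": {"Camera", "Lens", "name"},
    "http://example.org/onto/university.owl": {"Person", "Student", "Course"},
}


def build_term_index(ontologies):
    """Map each lower-cased term to the set of ontology URLs that use it."""
    index = defaultdict(set)
    for url, terms in ontologies.items():
        for term in terms:
            index[term.lower()].add(url)
    return index


def find_ontologies(index, *terms):
    """Return the sorted ontology URLs that use every one of the given terms."""
    if not terms:
        return []
    # .get() avoids creating empty entries for unknown terms.
    matches = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*matches))


index = build_term_index(ONTOLOGIES)
print(find_ontologies(index, "person"))             # single-term search
print(find_ontologies(index, "Person", "Student"))  # search constrained by two terms
```

Passing several terms corresponds to querying with constraints on the classes and properties a document uses; the user would then follow the returned URLs to inspect each ontology.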
Swoogle architecture
- SWD discovery component: This component has two distinct Web crawlers that discover SWDs distributed all over the Web. The crawlers can be invoked periodically to keep the information about SWDs up to date.
- Metadata creation component: This component creates metadata for each SWD.
- Data analysis component: This component uses the metadata to classify the relationships among a given set of SWDs and then calculates the rank of each SWD.
- Indexation and retrieval component: Swoogle is after all a search engine, so indexation and retrieval are necessary. Details of this component will be discussed later in this section.
- User interface: This is what the user sees when using the Swoogle search engine.
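To make the ranking step of the data analysis component more concrete, here is a minimal sketch of a generic PageRank-style power iteration over a graph of references between SWDs. This is an illustration under stated assumptions, not Swoogle's own ranking algorithm: the `rank_documents` function, the damping value, and the sample document names are hypothetical.

```python
def rank_documents(links, damping=0.85, iterations=50):
    """PageRank-style ranking; `links` maps a document to the documents it references."""
    docs = set(links)
    for targets in links.values():
        docs.update(targets)
    docs = sorted(docs)
    n = len(docs)
    rank = {doc: 1.0 / n for doc in docs}
    for _ in range(iterations):
        # Every document receives a small baseline share.
        new_rank = {doc: (1.0 - damping) / n for doc in docs}
        for doc in docs:
            targets = links.get(doc, [])
            if targets:
                share = damping * rank[doc] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A document with no outgoing references spreads its rank evenly.
                share = damping * rank[doc] / n
                for target in docs:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical reference graph: two instance documents and one ontology use foaf.owl.
LINKS = {
    "instance1.rdf": ["foaf.owl"],
    "instance2.rdf": ["foaf.owl", "geo.owl"],
    "geo.owl": ["foaf.owl"],
}

ranks = rank_documents(LINKS)
for doc in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{doc}: {ranks[doc]:.3f}")
```

In this toy graph the ontology that most documents refer to ends up with the highest rank, which is the intuition behind ranking SWDs by how they relate to each other.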
- The crawler visits the Web to collect SWDs, ignoring all other documents (HTML, PDF, image files).
- For each SWD discovered, Swoogle extracts metadata from the document and indexes it in an information retrieval system for later searches and queries.
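The two steps above, keeping only SWDs and extracting metadata from each, might look roughly like the sketch below. It assumes a much simpler test than a real crawler would use: a document counts as an SWD only if it is well-formed XML whose root element is `rdf:RDF`, and the metadata is just the list of classes and properties it declares. The `extract_metadata` function and the sample documents are hypothetical; only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"
OWL_NS = "http://www.w3.org/2002/07/owl#"

CLASS_TAGS = {f"{{{OWL_NS}}}Class", f"{{{RDFS_NS}}}Class"}
PROPERTY_TAGS = {
    f"{{{OWL_NS}}}ObjectProperty",
    f"{{{OWL_NS}}}DatatypeProperty",
    f"{{{RDF_NS}}}Property",
}
ABOUT = f"{{{RDF_NS}}}about"


def extract_metadata(url, content):
    """Return a metadata dict for an RDF/XML document, or None for anything else."""
    try:
        root = ET.fromstring(content)
    except ET.ParseError:
        return None  # not well-formed XML, e.g. typical HTML or binary files
    if root.tag != f"{{{RDF_NS}}}RDF":
        return None  # XML, but not an RDF/XML document
    classes = [e.get(ABOUT) for e in root.iter() if e.tag in CLASS_TAGS and e.get(ABOUT)]
    properties = [e.get(ABOUT) for e in root.iter() if e.tag in PROPERTY_TAGS and e.get(ABOUT)]
    return {"url": url, "classes": sorted(classes), "properties": sorted(properties)}


# Hypothetical sample documents.
SAMPLE_SWD = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto#Person"/>
  <owl:DatatypeProperty rdf:about="http://example.org/onto#name"/>
</rdf:RDF>"""
SAMPLE_HTML = "<html><body><p>Hello<br></body></html>"

print(extract_metadata("http://example.org/onto.owl", SAMPLE_SWD))
print(extract_metadata("http://example.org/index.html", SAMPLE_HTML))
```

The returned dictionaries are what an indexing step would store; documents for which the function returns `None` are simply skipped.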
How does Swoogle crawl the Semantic Web?
- Manual submission
- Google-based meta-crawling
- Bounded HTML crawling
- RDF crawling
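As an illustration of the bounded HTML crawling strategy, the sketch below performs a breadth-first crawl from a seed page, stays on the seed's host, stops following HTML links beyond a depth limit, and records links that look like SWDs. It assumes that "bounded" means a host and depth limit and that SWD candidates can be recognized by file extension; both are simplifications. The in-memory `WEB` graph, the `bounded_crawl` function, and its parameters are hypothetical.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical in-memory "web": page URL -> links found on that page.
WEB = {
    "http://example.org/": [
        "http://example.org/a.html",
        "http://example.org/foaf.rdf",
    ],
    "http://example.org/a.html": [
        "http://example.org/b.html",
        "http://example.org/onto.owl",
        "http://other.example.com/x.rdf",
    ],
    "http://example.org/b.html": ["http://example.org/deep.rdf"],
}

SWD_EXTENSIONS = (".rdf", ".owl")  # assumed heuristic for spotting SWD candidates


def bounded_crawl(seed, fetch_links, max_depth=1):
    """Breadth-first crawl from `seed`, bounded by host and link depth."""
    host = urlparse(seed).netloc
    seen = {seed}
    queue = deque([(seed, 0)])
    candidates = []
    while queue:
        url, depth = queue.popleft()
        for link in fetch_links(url):
            if link in seen:
                continue
            seen.add(link)
            parsed = urlparse(link)
            if parsed.path.lower().endswith(SWD_EXTENSIONS):
                candidates.append(link)  # hand over to the RDF crawler or indexer
            elif parsed.netloc == host and depth < max_depth:
                queue.append((link, depth + 1))  # keep following HTML pages
    return candidates


def fetch_links(url):
    """Stand-in for downloading a page and extracting its links."""
    return WEB.get(url, [])


print(bounded_crawl("http://example.org/", fetch_links, max_depth=1))
print(bounded_crawl("http://example.org/", fetch_links, max_depth=2))
```

Raising `max_depth` widens the crawl: in the sample graph, the document linked from the deepest page is only found when the depth limit is 2.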