API:Query
This page is part of the MediaWiki API documentation.
The action=query module allows you to get most of the data stored in a wiki, such as the wikitext of a particular article, or the token you need to change wiki content.
The query module has many submodules (called query modules), each with a different function. There are three types of query modules:
- Meta information about the wiki and the logged-in user
- Properties of pages, including page revisions and content
- Lists of pages that match certain criteria
You should combine multiple query modules to get what you need in one request. For example, prop=info|revisions&list=backlinks|embeddedin|allimages&meta=userinfo calls six modules in one request.
Unlike meta and list query modules, all property query modules work on a set of pages that you specify using the titles, pageids, revids, or generator parameters. Use one of the first three if you know the pages' titles, page IDs, or revision IDs. Do not ask for one page at a time: this is very inefficient and consumes a lot of extra resources and bandwidth. Instead, request information about multiple pages by combining their titles or IDs with the "|" pipe symbol: titles=PageA|PageB|PageC.
Use generator if you want to get data about pages that are the result of another API call. For example, if you want to get data about pages in a certain category, do not query list=categorymembers and then call the API again with pageids set to all the returned results. Instead, combine the two calls into one by setting generator=categorymembers in place of the list parameter.
Lastly, you should always request the new "continue" syntax to iterate over results. To use it, always pass an empty continue= parameter, and check whether the result contains a continue section. If it does, merge the returned values into the original request and call the API again. Repeat until the result no longer contains a continue section.
Sample query
Before we get into the nitty-gritty, here's a useful sample query that simply gets the wiki markup (content) of a page:
api.php?action=query&prop=revisions&rvprop=content&format=jsonfm&titles=Main%20Page
This means: fetch (action=query) the content (prop=revisions with rvprop=content) of the most recent revision of Main Page (titles=Main%20Page), as JSON with whitespace to make it easier to read (format=jsonfm).
Alternatively, you can use action=raw as a parameter to index.php to get the content of a page: index.php?title=Main%20Page&action=raw
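The sample request can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name content_url and the default base URL are assumptions made for illustration, so substitute the api.php location of your own wiki.

```python
from urllib.parse import urlencode, quote


def content_url(title, base="https://en.wikipedia.org/w/api.php"):
    """Build the sample query URL for the latest revision content of one page.

    `base` is an assumed endpoint; every wiki exposes its own api.php.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "format": "jsonfm",
        "titles": title,
    }
    # quote_via=quote encodes spaces as %20 instead of "+"
    return base + "?" + urlencode(params, quote_via=quote)


print(content_url("Main Page"))
```

For real clients, format=json is the machine-readable counterpart of the jsonfm format used here for readability.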
Specifying pages
You can specify pages in the following ways:
- By name using the titles parameter, e.g. titles=Foo|Bar|Main_Page
- By page ID using the pageids parameter, e.g. pageids=123|456|75915
- By revision ID using the revids parameter, e.g. revids=478198|54872|54894545
  - Most query modules will convert the revision ID to the corresponding page ID. Only prop=revisions actually uses the revision ID itself.
- Using a generator
Specifying titles through the query string (either through titles or pageids) is limited to 50 titles per query (or 500 for those with the apihighlimits right, usually bots and sysops).
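To stay within this limit while still requesting many pages per call, a client can split its title list into groups. The sketch below is one way to do that; the helper name batch_titles is hypothetical, and the default of 50 assumes a client without the apihighlimits right.

```python
def batch_titles(titles, limit=50):
    """Yield pipe-joined groups of at most `limit` titles.

    Each yielded string can be used as the value of the titles parameter.
    """
    for start in range(0, len(titles), limit):
        yield "|".join(titles[start:start + limit])


titles = [f"Page {n}" for n in range(120)]
batches = list(batch_titles(titles))
print(len(batches))  # 120 titles are split into groups of 50, 50 and 20
```

The same helper works for pageids if the IDs are converted to strings first.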
Title normalization
Title normalization converts page titles to their canonical form. This means capitalizing the first character, replacing underscores with spaces, and changing namespace to the localized form defined for that wiki. Title normalization is done automatically, regardless of which query modules are used. However, any trailing line breaks in page titles (\n) will cause odd behavior and they should be stripped out first.
Capitalization, localization, "_" → " " (space), "Project" → "Wikipedia", ...
Result |
---|
{ "query": { "normalized": [ { "from": "Project:articleA", "to": "Wikipedia:ArticleA" }, { "from": "article_B", "to": "Article B" } ], "pages": { "-1": { "ns": 4, "title": "Wikipedia:ArticleA", "missing": "" }, "-2": { "ns": 0, "title": "Article B", "missing": "" } } } } |
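A client that needs to match the titles it sent against the titles in the result can build a lookup table from the normalized section. This is a minimal sketch that works on the response shown above, already decoded into a Python dict; the function name normalization_map is an assumption for illustration.

```python
def normalization_map(result):
    """Map each submitted title to its canonical form.

    Titles that were already canonical do not appear in the map.
    """
    entries = result.get("query", {}).get("normalized", [])
    return {entry["from"]: entry["to"] for entry in entries}


# The decoded sample response from above
response = {
    "query": {
        "normalized": [
            {"from": "Project:articleA", "to": "Wikipedia:ArticleA"},
            {"from": "article_B", "to": "Article B"},
        ],
        "pages": {
            "-1": {"ns": 4, "title": "Wikipedia:ArticleA", "missing": ""},
            "-2": {"ns": 0, "title": "Article B", "missing": ""},
        },
    }
}

mapping = normalization_map(response)
print(mapping["article_B"])
```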
Missing and invalid titles
Titles that don't exist or are invalid still appear in the <pages> section, but they have the missing="" or invalid="" attribute set. In output formats that support numeric array keys (such as JSON and PHP serialized), missing and invalid titles will have unique, negative page IDs. Query modules will just ignore missing or invalid titles, as they can't do anything useful with them. Titles in the Special: and Media: namespaces cannot be queried. If any such titles are found in the titles= parameter or passed to a module by a generator, a warning will be issued.
A missing title, an invalid one and an existing one in JSON
Result |
---|
{ "query": { "pages": { "-2": { "ns": 0, "title": "Doesntexist", "missing": "" }, "-1": { "title": "Talk:", "invalid": "" }, "54": { "pageid": 54, "ns": 0, "title": "Main Page" } } } } |
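Client code usually has to separate these three cases before processing a result. The sketch below does so by checking for the missing and invalid keys in a decoded JSON response shaped like the one above; the function name split_pages is hypothetical.

```python
def split_pages(result):
    """Sort page titles into existing, missing and invalid lists."""
    existing, missing, invalid = [], [], []
    for page in result["query"]["pages"].values():
        title = page.get("title")
        if "invalid" in page:
            invalid.append(title)
        elif "missing" in page:
            missing.append(title)
        else:
            existing.append(title)
    return existing, missing, invalid


# The decoded sample response from above
response = {
    "query": {
        "pages": {
            "-2": {"ns": 0, "title": "Doesntexist", "missing": ""},
            "-1": {"title": "Talk:", "invalid": ""},
            "54": {"pageid": 54, "ns": 0, "title": "Main Page"},
        }
    }
}

existing, missing, invalid = split_pages(response)
print(existing, missing, invalid)
```

Testing for the presence of the key rather than its value matters here, because the value is an empty string.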
Resolving redirects
Redirects can be resolved automatically by setting the redirects parameter, so that the target of a redirect is returned instead of the given title. Resolved redirects are listed in a redirects section of the result. Each entry always contains from and to attributes, and may contain a tofragment attribute for redirects that point to a specific section.
Both normalization and redirection may take place. In the case of multiple redirects, all redirects will be resolved, and in the case of a circular redirect, there might not be a page in the 'pages' section (see also below). Redirect resolution cannot be used in combination with the revids= parameter or with a generator generating revids; doing that will produce a warning and will not resolve redirects for the specified revids.
The examples below show how the redirects parameter works.
Using "redirects" parameter. "Main page" is a redirect to "Main Page"
Result |
---|
{ "query": { "redirects": [ { "from": "Main page", "to": "Main Page" } ], "pages": { "1": { "pageid": 1, "ns": 0, "title": "Main Page" } } } } |
Same request but without the "redirects" parameter.
Result |
---|
{ "query": { "pages": { "3": { "pageid": 3, "ns": 0, "title": "Main page" } } } } |
Without "redirects" you may want to use prop=info to obtain redirect status.
Result |
---|
{ "query": { "pages": { "3": { "pageid": 3, "ns": 0, "title": "Main page", "contentmodel": "wikitext", "pagelanguage": "en", "touched": "2015-02-21T13:03:17Z", "lastrevid": 3, "length": 23, "redirect": "" } } } } |
Request with a section link. "Wikipedia:!--" is a redirect to "Wikipedia:Manual of Style#Invisible comments"
Result |
---|
{ "query": { "redirects": [ { "from": "Wikipedia:!--", "to": "Wikipedia:Manual of Style", "tofragment": "Invisible comments" } ], "pages": { "33697": { "pageid": 33697, "ns": 4, "title": "Wikipedia:Manual of Style" } } } } |
Here is a case of a circular redirect: Page1 → Page2 → Page3 → Page1. Also, in this example a non-normalized name 'page1' is used.
Result |
---|
{ "query": { "normalized": [ { "from": "page1", "to": "Page1" } ], "redirects": [ { "from": "Page1", "to": "Page2" }, { "from": "Page2", "to": "Page3" }, { "from": "Page3", "to": "Page1" } ] } } |
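To find out where a submitted title ended up, a client can follow the normalized and redirects sections of the result. The following sketch does this on a decoded JSON response and returns None for a circular chain like the one above; the function name resolve_title is an assumption for illustration, and the logic runs entirely on the client.

```python
def resolve_title(title, result):
    """Follow normalization and redirects for one submitted title.

    Returns the final title, or None if the redirects form a loop.
    """
    query = result.get("query", {})
    normalized = {n["from"]: n["to"] for n in query.get("normalized", [])}
    redirects = {r["from"]: r["to"] for r in query.get("redirects", [])}

    current = normalized.get(title, title)
    seen = {current}
    while current in redirects:
        current = redirects[current]
        if current in seen:
            return None  # circular redirect, no final page
        seen.add(current)
    return current


# The decoded circular-redirect response from above
circular = {
    "query": {
        "normalized": [{"from": "page1", "to": "Page1"}],
        "redirects": [
            {"from": "Page1", "to": "Page2"},
            {"from": "Page2", "to": "Page3"},
            {"from": "Page3", "to": "Page1"},
        ],
    }
}
simple = {"query": {"redirects": [{"from": "Main page", "to": "Main Page"}]}}

print(resolve_title("page1", circular))
print(resolve_title("Main page", simple))
```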
Limits
See the API documentation on limits for more information.
Continuing queries
MediaWiki version: ≥ 1.21
- See raw continue for the query-continue information
Very often you will not get all the data you want in one request. If there is more data, the result will have a continue element. Adding the values it contains to your original request will get the next portion of the data. For backwards compatibility, clients must specify continue= in their initial queries to select this method (although this is planned to become the default in 1.26).
Using the query-continue value
Result |
---|
{ "continue": { "accontinue": "List_of_19th_century_baseball_players", "continue": "-||" }, "batchcomplete": "", "query": { "allcategories": [ { "*": "List of" }, { "*": "List ofPalestinians" }, { "*": "List of \"M\" series military vehicles" }, { "*": "List of ''The Fast and the Furious'' characters" }, { "*": "List of 100 Deeds for Eddie McDowd" }, { "*": "List of 1919 Actors" }, { "*": "List of 1972 births" }, { "*": "List of 1999 ballet premieres" }, { "*": "List of 19th-century Russian artists" }, { "*": "List of 19th century Russian artists" } ] } } |
You can now add continue=-|| and accontinue=List_of_19th_century_baseball_players to the original request (the new value for continue replaces the initial empty string) to get the next set of results. If there are no more results, there will not be a continue element.
Here is the recommended way to iterate over query results (using the Python requests library). Note that clients should not manipulate or depend on any specifics of the values returned inside the continue element, as they may change.
    import requests

    def query(request):
        request['action'] = 'query'
        request['format'] = 'json'
        lastContinue = {'continue': ''}
        while True:
            # Clone original request
            req = request.copy()
            # Modify it with the values returned in the 'continue' section of the last result.
            req.update(lastContinue)
            # Call API
            result = requests.get('http://en.wikipedia.org/w/api.php', params=req).json()
            if 'error' in result:
                raise RuntimeError(result['error'])
            if 'warnings' in result:
                print(result['warnings'])
            if 'query' in result:
                yield result['query']
            if 'continue' not in result:
                break
            lastContinue = result['continue']

    for result in query({'generator': 'allpages', 'prop': 'links'}):
        pass  # process result data
Getting a list of page IDs
With the indexpageids parameter, you'll get a list of all page IDs listed in the <pageids> element. This is particularly useful for formats like JSON in which the pages array has numeric indexes.
Getting a list of all page IDs
Result |
---|
{ "query": { "pageids": [ "-2", "-1", "15580374" ], "pages": { "-2": { "ns": 0, "title": "Fksdlfsdss", "missing": "" }, "-1": { "title": "Talk:", "invalid": "" }, "15580374": { "pageid": 15580374, "ns": 0, "title": "Main Page" } } } } |
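With the pageids list available, a client can walk the pages object without having to discover its keys first. This is a minimal sketch on the decoded response shown above; the function name pages_in_order is hypothetical.

```python
def pages_in_order(result):
    """Return the page objects in the order given by the pageids list."""
    query = result["query"]
    return [query["pages"][page_id] for page_id in query["pageids"]]


# The decoded sample response from above
response = {
    "query": {
        "pageids": ["-2", "-1", "15580374"],
        "pages": {
            "-2": {"ns": 0, "title": "Fksdlfsdss", "missing": ""},
            "-1": {"title": "Talk:", "invalid": ""},
            "15580374": {"pageid": 15580374, "ns": 0, "title": "Main Page"},
        },
    }
}

for page in pages_in_order(response):
    print(page["title"])
```

Note that the entries in pageids are strings, matching the keys of the pages object.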
Exporting pages
You can export pages through the API with the export parameter. If the export parameter is set, an XML dump of all pages in the <pages> element will be added to the result. The export parameter only gives a result when used with specified pages (generator, titles, pageids or revids). Note that the XML dump will be wrapped in the requested format; if that format is XML, characters like < and > will be encoded as entities (&lt; and &gt;). If the exportnowrap parameter is also set, only the XML dump (not wrapped in an API result) will be returned.
Exporting the contents of API
Result |
---|
<!-- TODO -->
|
Exporting all templates used in API
Result |
---|
<?xml version="1.0"?> <api> <query> <pages> <page pageid="16385" ns="10" title="Template:API Intro" /> <page pageid="6458" ns="10" title="Template:Languages" /> <page pageid="9631" ns="10" title="Template:Languages/Lang" /> </pages> <export> <!-- XML dump here --> </export> </query> </api> |
See also: Importing pages
Generators
With generators, you can use the output of a list instead of the titles parameter. The output of the list must be a list of pages, whose titles are automatically used instead of the titles, pageids or revids parameter. Other query modules will treat generated pages as if they were given in a parameter. Only one generator is allowed. Some prop modules can also be used as a generator.
Parameters passed to a generator must be prefixed with a g. For instance, when using generator=backlinks, use gbltitle instead of bltitle.
It should also be noted that generators only pass page titles to the 'real' query, and do not output any information themselves. Setting parameters like gcmprop will therefore have no effect.
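The prefixing rule is mechanical, so a client can convert the parameters of a list request into generator form automatically. The sketch below illustrates this; the helper name as_generator is hypothetical and it only rewrites parameter names, without checking that the module can actually be used as a generator.

```python
def as_generator(module, params):
    """Turn parameters for list=<module> into generator=<module> form.

    Every module parameter name gets the "g" prefix described above.
    """
    request = {"generator": module}
    for name, value in params.items():
        request["g" + name] = value
    return request


request = as_generator("backlinks", {"bltitle": "Main Page", "bllimit": 10})
print(request)
```

Parameters for the other modules in the same request, such as prop=info, are added to the result unchanged.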
Using list=allpages as generator
Get links and categories for the first three pages in the main namespace starting with "Ba"
Result |
---|
<?xml version="1.0" encoding="utf-8"?> <api> <query-continue> <allpages gapfrom="Ba'ad Sneen (Song)" /> </query-continue> <query> <pages> <page pageid="98178" ns="0" title="Ba"> <links> <pl ns="0" title="BA" /> <pl ns="4" title="Wikipedia:Redirect" /> <pl ns="4" title="Wikipedia:Template messages/Redirect pages" /> <pl ns="10" title="Template:R from alternative name" /> <pl ns="10" title="Template:R from alternative spelling" /> <pl ns="14" title="Category:Redirects from other capitalisations" /> </links> <categories> <cl ns="14" title="Category:Redirects from other capitalisations" /> <cl ns="14" title="Category:Unprintworthy redirects" /> </categories> </page> <page pageid="14977970" ns="0" title="Ba'"> <links> <pl ns="0" title="Kirkwall Ba game" /> </links> </page> <page pageid="10463369" ns="0" title="Ba'Gamnan"> <links> <pl ns="0" title="Characters of Final Fantasy XII" /> </links> </page> </pages> </query> </api> |
Generators and redirects
Here, we use prop=links as a generator. This query will get all the links from all the pages that are linked from Title. For this example, assume that Title has links to TitleA and TitleB. TitleB is a redirect to TitleC. TitleA links to TitleA1, TitleA2 and TitleA3, and TitleC links to TitleC1 and TitleC2. Redirects are resolved because the redirects parameter is set.
The query will execute the following steps:
- Resolve redirects for titles in the titles parameter
- For all the titles in the titles parameter, get the list of pages they link to
- Resolve redirects in that list
- Run the prop=links query on that list of titles
Using redirect resolution with generators
Result |
---|
<?xml version="1.0" encoding="utf-8"?> <api> <query> <pages> <page pageid="32" ns="0" title="TitleA"> <links> <pl ns="0" title="TitleA1" /> <pl ns="0" title="TitleA2" /> <pl ns="0" title="TitleA3" /> </links> </page> <page pageid="54" ns="0" title="TitleC"> <links> <pl ns="0" title="TitleC1" /> <pl ns="0" title="TitleC2" /> </links> </page> </pages> <redirects> <r from="TitleB" to="TitleC" /> </redirects> </query> </api> |
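The four steps can be traced with a small local simulation. The sketch below does not call the API and is not how the server is implemented; LINKS and REDIRECTS are made-up in-memory tables that mirror the pages assumed in this example.

```python
# Hypothetical link and redirect tables matching the example pages
LINKS = {
    "Title": ["TitleA", "TitleB"],
    "TitleA": ["TitleA1", "TitleA2", "TitleA3"],
    "TitleC": ["TitleC1", "TitleC2"],
}
REDIRECTS = {"TitleB": "TitleC"}


def resolve(title):
    """Follow redirects until a non-redirect page (or a loop) is reached."""
    seen = set()
    while title in REDIRECTS and title not in seen:
        seen.add(title)
        title = REDIRECTS[title]
    return title


def links_via_generator(titles):
    start = [resolve(t) for t in titles]                         # step 1
    generated = [l for t in start for l in LINKS.get(t, [])]     # step 2
    targets = [resolve(t) for t in generated]                    # step 3
    return {t: LINKS.get(t, []) for t in targets}                # step 4


print(links_via_generator(["Title"]))
```

The output has entries for TitleA and TitleC but none for TitleB, which matches the pages section of the result above.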
Generators and continuation
You can continue queries using a generator the same way as other queries. In the first call to the API, the generator creates a batch of titles that the actual query will work on. Continuing this query will first continue the actual query, until you have all the data about the first batch. The next continuation will then create a new batch from the generator, and so on. If you use rawcontinue, please read API:Raw Query Continue to understand which parameters you have to include in the continuation queries. If instead you use continue, you simply pass all parameters back, as you do for queries without a generator. The result will have the batchcomplete property set every time a batch of titles is completed. This enables you to process that batch before continuing the query. Please note that for generators used together with a non-query module, the continue format will always be used.
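One way to use batchcomplete is to collect partial results until the property appears, and only then hand the batch on for processing. The sketch below shows this on already decoded responses; the function name group_batches and the sample data are made up for illustration, and in practice the responses would come from a continuation loop.

```python
def group_batches(results):
    """Group successive API results into completed batches.

    A batch ends with the first result that carries "batchcomplete".
    """
    pending = []
    for result in results:
        pending.append(result.get("query", {}))
        if "batchcomplete" in result:
            yield pending
            pending = []
    if pending:
        # Results left over without a closing batchcomplete marker
        yield pending


# Made-up responses: the first batch needs two requests, the second one
fake_results = [
    {"query": {"part": 1}, "continue": {"plcontinue": "x", "continue": "||"}},
    {"query": {"part": 2}, "batchcomplete": "", "continue": {"gapcontinue": "y", "continue": "gapcontinue||"}},
    {"query": {"part": 3}, "batchcomplete": ""},
]

for batch in group_batches(fake_results):
    print(len(batch))
```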
More generator examples
- Show info about 4 pages starting at the letter "T"
- http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=4&gapfrom=T&prop=info
- Show content of first 2 non-redirect pages beginning at "Re"
- http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=2&gapfilterredir=nonredirects&gapfrom=Re&prop=revisions&rvprop=content
Page types
Page type | Example | Used in the given page(s) | Which pages have it | List all in the wiki |
---|---|---|---|---|
Page link | [[Page]] | prop=links | list=backlinks | list=alllinks |
Template transclusion | {{Template}} | prop=templates | list=embeddedin | list=alltransclusions |
Categories | [[category:Cat]] | prop=categories | list=categorymembers | list=allcategories |
Images | [[file:image.png]] | prop=images | list=imageusage | list=allimages |
Language links | [[ru:Page]] | prop=langlinks | list=langbacklinks | |
Interwiki links | [[meta:Page]] | prop=iwlinks | list=iwbacklinks | |
URLs | http://mediawiki.org | prop=extlinks | list=exturlusage | |
Possible warnings
- No support for special pages has been implemented
- Thrown if a title in the Special: or Media: namespace is given
- Redirect resolution cannot be used together with the revids= parameter. Any redirects the revids= point to have not been resolved.
- Note that this can also be caused by a generator that generates revids