Extension:EventLogging/Programming

From MediaWiki.org
See Extension:EventLogging/Guide for a comprehensive introduction to EventLogging, developing and deploying EventLogging schemas, and more.

How it works

When code logs an event, it must reference a schema. Here's some actual working JavaScript:

var action = 'gettingstarted-impression';
mw.eventLog.logEvent( 'GettingStarted', {
    action : action,
    page   : mw.config.get( 'wgTitle' )
} );

In this case the schema is "GettingStarted", which is defined on a Schema: page on Meta. Schema names should use InitialCaps.

  1. The schema is a JSON structure that specifies the fields in the event: their names, their types (integer, string, boolean, etc.), whether they are required, their allowed values, and so on. It is stored in the Schema: namespace of a wiki running the extension.
  2. PHP code in core or an extension explicitly depends on a particular revision of a particular data model, and retrieves it from the schema host (configured as $wgEventLoggingSchemaIndexUri).
    • For client-side event logging the MediaWiki ResourceLoader gets a particular revision of a schema from meta-wiki's Schema: namespace (e.g. http://meta.wikimedia.org/wiki/Schema:GettingStarted revision 4867730), caches it, and makes it available to client JavaScript such as the call to logEvent() above.
    • For server-side event logging, PHP code simply calls efLogServerSideEvent( SchemaName, revision, array( my fields ) )

How to make a data model

  • Meet with a researcher to determine what you're going to log, and name the fields to log, reusing well-known field names.
  • Create a JSON structure representing this data model in the Schema: namespace on Meta, and tweak it until it saves without errors.
    • Sample: m:Schema:OpenTask
    • Tip: http://jsonlint.com/ has better error reporting; copy and paste your JSON into it.
    • Tip: if you have a JSON file with the desired fields and values, http://www.jsonschema.net/ will guess at a schema for it that you can start with (it adds extra info like "id" that we don't currently use).
  • Use the schema's talk page (sample) to link to experiments using this, discuss details, etc.
    • Always document which code logs the event, and under what circumstances.

Then:

  • Developers write code to log events that match the data model.
  • The data model tells analysts what information is in the logs.

Versioning

If code tries to log an event that doesn't match the data model that EventLogging retrieved, EventLogging will log the event anyway but flag it as invalid. Since you always give a schema revision, you can edit the schema as much as you want without affecting existing code.

It's OK to have different kinds of events (often called actions) sharing one data model. That way the events go into one table and it may simplify querying and multi-dimensional analysis. Only add "required":true to the fields that are applicable to all events.
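
As a rough illustration of one data model shared by several actions, the sketch below uses a hypothetical schema. The field names (action, page, linkTarget), the action values, and the helper functions requiredFields() and hasRequiredFields() are invented for this example and are not taken from any schema on Meta or from the extension. Only the fields that every action carries are marked as required.

```javascript
// Hypothetical shared data model: all field names are invented for illustration.
var sharedSchema = {
    properties: {
        // Present on every event, so it is safe to require.
        action: {
            type: 'string',
            required: true,
            enum: [ 'impression', 'click' ]
        },
        // Also present on every event.
        page: { type: 'string', required: true },
        // Only meaningful for 'click' events, so it is left optional.
        linkTarget: { type: 'string' }
    }
};

// Two different kinds of event that fit the same data model.
var impressionEvent = { action: 'impression', page: 'Main Page' };
var clickEvent = { action: 'click', page: 'Main Page', linkTarget: 'Help:Contents' };

// List the names of the fields the schema marks as required.
function requiredFields( schema ) {
    return Object.keys( schema.properties ).filter( function ( name ) {
        return schema.properties[ name ].required === true;
    } );
}

// Check that an event carries every required field.
function hasRequiredFields( schema, eventData ) {
    return requiredFields( schema ).every( function ( name ) {
        return Object.prototype.hasOwnProperty.call( eventData, name );
    } );
}
```

Had linkTarget been marked as required here, every impression event would fail the check, which is why optional fields suit values that apply to only some actions.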

Data fields

This section has moved to Extension:EventLogging/Guide#Data_fields.

Available data models

See m:Category:Schemas (active).

Implementation notes

JSON schema validation

Each data model JSON file on meta-wiki is a JSON schema. This is an evolving standard to specify the format of JSON structures, in our case the logged event.

  • See the JSON schema draft.
  • As of December 2012 EventLogging only validates that the schemas on meta are valid JSON.
  • When code attempts to log an event, EventLogging only pays attention to a subset of JSON schema features; as of November 2012 this includes:
    • type: boolean, integer, number, string, timestamp
    • required: true/false
    • enum values
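
The subset of checks listed above can be sketched in a few lines of JavaScript. This is not the extension's code: the function names isInstanceOf() and validateEvent() are hypothetical, the treatment of "timestamp" (a Date or a non-negative whole number) is an assumption, and the sketch silently ignores fields that the schema does not declare.

```javascript
// Sketch only: checks type, required, and enum against a flat event object.

// Return true if `value` is an instance of the schema type `type`.
function isInstanceOf( value, type ) {
    switch ( type ) {
        case 'integer':
            return typeof value === 'number' && isFinite( value ) && value % 1 === 0;
        case 'number':
            return typeof value === 'number' && isFinite( value );
        case 'timestamp':
            // Assumption: a timestamp is a Date or a non-negative whole number.
            return value instanceof Date ||
                ( typeof value === 'number' && value >= 0 && value % 1 === 0 );
        default:
            // 'boolean' and 'string' map directly onto typeof.
            return typeof value === type;
    }
}

// Return a list of problems; an empty list means the event passed every check.
function validateEvent( eventData, schema ) {
    var errors = [],
        properties = schema.properties || {};

    // Check the type and enum constraints of each declared field in the event.
    Object.keys( eventData ).forEach( function ( name ) {
        var spec = properties[ name ];
        if ( !spec ) {
            // This sketch ignores fields the schema does not declare.
            return;
        }
        if ( spec.type && !isInstanceOf( eventData[ name ], spec.type ) ) {
            errors.push( 'Wrong type for field: ' + name );
        }
        if ( spec.enum && spec.enum.indexOf( eventData[ name ] ) === -1 ) {
            errors.push( 'Value not in enum for field: ' + name );
        }
    } );

    // Every required field must be present.
    Object.keys( properties ).forEach( function ( name ) {
        if ( properties[ name ].required && !( name in eventData ) ) {
            errors.push( 'Missing required field: ' + name );
        }
    } );

    return errors;
}
```

Returning a list of problems rather than throwing mirrors the behavior described on this page, where an invalid event is flagged but still logged.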

Programming topics

Good starting code

  • The WikimediaEvents extension has working code to log server-side events in PHP in WikimediaEvents.php.
  • The GettingStarted extension has setup code to declare and require the "openTasks" schema resource and log to it in JavaScript, but it's gotten more elaborate in 2013.

Client-side logging

  • Require your schema wherever you need to log events (this pulls in ext.eventLogging, which implements the mw.eventLog module).
  • See modules/ext.eventLogging.core.js for API documentation.

Tips

  • In JavaScript code, use mw.eventLog.setDefaults() to set common values for fields to log that don't change, such as version, the user's name, etc.
  • Extension:EventLogging/Guide#Data fields lists common field names already used in schemas and the JavaScript that fills them. Don't reinvent the wheel.
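
The idea behind defaults can be sketched without MediaWiki at all. The stand-in below is hypothetical: createLogger() and buildEvent() are invented names, the setDefaults( schemaName, values ) shape is assumed for illustration, and none of it is the mw.eventLog implementation. It only shows how per-schema defaults could be merged underneath the fields passed at log time.

```javascript
// Hypothetical stand-in, not the mw.eventLog implementation.
function createLogger() {
    var defaults = {};

    return {
        // Remember values that should accompany every event for a schema.
        setDefaults: function ( schemaName, values ) {
            defaults[ schemaName ] = Object.assign( {}, defaults[ schemaName ], values );
        },
        // Merge defaults with the fields given at log time; explicit fields win.
        buildEvent: function ( schemaName, fields ) {
            return Object.assign( {}, defaults[ schemaName ], fields );
        }
    };
}

var logger = createLogger();
// Set the unchanging values once...
logger.setDefaults( 'MySchema', { version: 2, userName: 'ExampleUser' } );
// ...then each call only has to supply what differs.
var builtEvent = logger.buildEvent( 'MySchema', { action: 'click' } );
```

With defaults in place, each logging call site stays short and the shared values cannot drift apart between calls.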

Debugging

If code attempts to log an invalid event, EventLogging detects that it is invalid and flags it, but logs it anyway. Turn on ResourceLoader debug mode to see warnings about invalid events in your browser's JavaScript console. The most likely cause of an invalid event is that your page doesn't load the right schema. In your browser's JavaScript console, enter mw.eventLog.schemas to see which schemas have been loaded on the current page.

Monitoring events

  • Client-side event logging works by requesting a beacon image, event.gif, with the log info in its query string. To see the log events, you can:
    • watch for this request in your browser's network console,
    • look for it in your web server's access logs,
    • run the toy web server server/bin/eventlogging-devserver in the EventLogging extension, which pretty-prints the query string.
  • To monitor events after processing, you can append an always callback to a logEvent call, as in the following temporary debugging code:
    mw.eventLog.logEvent( 'MySchema', { foo: 'bar' } ).always( function ( event ) {
        // Report whether the event passed client-side validation.
        if ( event.clientValidated ) {
            console.log( 'It validated!' );
        } else {
            console.warn( 'It did not validate!' );
        }
    } );
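
To make the beacon mechanism more concrete, here is a minimal sketch of packing an event into the query string of an image URL and unpacking it again, which is roughly what a pretty-printing development server has to do. The URL https://example.org/event.gif, the helper names makeBeaconUrl() and parseBeaconUrl(), the shape of the payload, and the exact encoding (URL-encoded JSON) are assumptions for illustration; check the extension source for the real format.

```javascript
// Assumed beacon endpoint; a real wiki configures its own.
var BEACON_BASE_URL = 'https://example.org/event.gif';

// Serialize an event object into a beacon URL (URL-encoded JSON in the query string).
function makeBeaconUrl( baseUrl, eventData ) {
    return baseUrl + '?' + encodeURIComponent( JSON.stringify( eventData ) );
}

// Recover the event object from a beacon URL, as a log reader might.
function parseBeaconUrl( url ) {
    var query = url.slice( url.indexOf( '?' ) + 1 );
    return JSON.parse( decodeURIComponent( query ) );
}

var beaconUrl = makeBeaconUrl( BEACON_BASE_URL, {
    schema: 'MySchema',
    event: { foo: 'bar' }
} );

// In a browser, requesting the image is what would send the event:
//     new Image().src = beaconUrl;
```

Because everything travels in the URL, the same string that shows up in the browser's network console or the web server's access log can be decoded by hand with parseBeaconUrl().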

Logging clicks on links

Often you want to log clicks on links. If a click takes the user away from the current page, there's a chance that the browser will move to the new page before the request for the beacon image makes it onto the network, and the browser will drop the request. The E3 team experimented with using deferred promises to deal with this, but that introduced known and unknown unknowns. Bug 42815 is related to this issue.

Logging before showing the next page raises significant performance concerns, so our recommendation is not to do that until the new beacon API becomes available [1]. Details on the performance issues can be found at https://bugzilla.wikimedia.org/show_bug.cgi?id=52287
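
For orientation, the sketch below shows feature detection around the browser's navigator.sendBeacon() method, which is the standard Beacon API call. Everything else is an assumption: sendWithBeacon() is a hypothetical helper, the JSON payload format is invented, and whether an EventLogging endpoint accepts such requests is not something this page establishes.

```javascript
// Send an event with the Beacon API when the browser offers it.
// `nav` is passed in (normally window.navigator) so the function is easy to test.
function sendWithBeacon( nav, url, eventData ) {
    if ( nav && typeof nav.sendBeacon === 'function' ) {
        // sendBeacon queues the request so it can outlive the current page;
        // it returns true if the browser accepted the data for delivery.
        return nav.sendBeacon( url, JSON.stringify( eventData ) );
    }
    // No Beacon API: report that nothing was sent and let the caller decide.
    return false;
}
```

In a browser this would be called as sendWithBeacon( window.navigator, url, eventData ) from a click handler, without delaying navigation to the next page.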

See also