[WIP] Add Citation meta to include Citation frequency/count #22

Open
wants to merge 3 commits into
base: master
2 changes: 1 addition & 1 deletion CONFIGURATION.md
@@ -11,7 +11,7 @@
- `$GLOBALS['scigReferenceListCacheType']` to disable caching for the reference list, use setting
[`CACHE_NONE`][mw-cachetype] otherwise the cache is being renewed on each new revision or when
the page is purged
- `$GLOBALS['scigStrictParserValidationEnabled']` whether a strict validation of input data for
- `$GLOBALS['scigEnabledStrictParserValidation']` whether a strict validation of input data for
the `{{#scite:}}` parser should be carried out or not
- `$GLOBALS['scigEnabledCitationTextChangeUpdateJob']` whether an update job should be dispatched
for changed citation text entities or not
3 changes: 3 additions & 0 deletions README.md
@@ -122,6 +122,9 @@ but can also be executed using `composer phpunit` from the extension base direct

[GNU General Public License, version 2 or later][gpl-licence].

Parts of the bibtex-related parsing code have been reused from the http://bibliophile.sourceforge.net project,
which is released under the GPL license.

[smw]: https://github.com/SemanticMediaWiki/SemanticMediaWiki
[contributors]: https://github.com/SemanticMediaWiki/SemanticCite/graphs/contributors
[travis]: https://travis-ci.org/SemanticMediaWiki/SemanticCite
15 changes: 11 additions & 4 deletions SemanticCite.php
@@ -22,7 +22,7 @@
return 1;
}

define( 'SCI_VERSION', '1.0.0' );
define( 'SCI_VERSION', '1.1.0-alpha' );

/**
* @codeCoverageIgnore
@@ -166,14 +166,20 @@
* Whether a strict validation on behalf of the #scite parser should be
* enabled or not
*/
$GLOBALS['scigStrictParserValidationEnabled'] = true;
$GLOBALS['scigEnabledStrictParserValidation'] = true;

/**
* Whether an update job should be dispatched for changed citation text
* entities or not
*/
$GLOBALS['scigEnabledCitationTextChangeUpdateJob'] = true;

/**
* Whether an article should collect meta information about citation
* resources (e.g. citation frequency)
*/
$GLOBALS['scigEnabledCitationMetaRecord'] = false;

// Finalize registration process
$GLOBALS['wgExtensionFunctions'][] = function() {

@@ -188,9 +194,10 @@
'tooltipRequestCacheTTL' => $GLOBALS['scigTooltipRequestCacheTTLInSeconds'],
'citationReferenceCaptionFormat' => $GLOBALS['scigCitationReferenceCaptionFormat'],
'referenceListType' => $GLOBALS['scigReferenceListType'],
'strictParserValidationEnabled' => $GLOBALS['scigStrictParserValidationEnabled'],
'enabledstrictParserValidation' => $GLOBALS['scigEnabledStrictParserValidation'],
'cachePrefix' => $GLOBALS['scigCachePrefix'],
'enabledCitationTextChangeUpdateJob' => $GLOBALS['scigEnabledCitationTextChangeUpdateJob']
'enabledCitationTextChangeUpdateJob' => $GLOBALS['scigEnabledCitationTextChangeUpdateJob'],
'enabledCitationMetaRecord' => $GLOBALS['scigEnabledCitationMetaRecord']
);

$applicationFactory = ApplicationFactory::getInstance();
35 changes: 20 additions & 15 deletions docs/04-scite.md
@@ -55,14 +55,14 @@ a `Byrne 2008` resource the short or the explicit reference parameter form can b

If it becomes necessary to rename a citation key (because a resource with key `Foo 2007`
no longer represents a unique resource due to adding another resource with the same key)
then the existing usage of that resource needs to be queried and changed before applying
then it is recommended that the existing usage is queried and changed before applying
the new citation key (e.g. `Foo 2007a`).

## Citation text

The property `Citation text` contains the formatted output of a citation resource and is
used when the [referencelist](05-referencelist.md) is generated. The text is formatted using
assinged template or can be added directly (without further processing) in its final form
used when the [referencelist](05-referencelist.md) is generated. The text is formatted either by
an assigned template or can be added directly (without further processing) in its final form
to the `|citation text=` parameter.

```
@@ -73,8 +73,8 @@ to the `|citation text=` parameter.
```

In case the parameter `|citation text=` is not declared then `#scite` is going to try to determine
an a template by first looking at the `|template=` parameter and if such parameter is not assigned
then the [template](03-template-mapping.md) assigned to the type of the resource
a template by first looking at the `|template=` parameter and if such parameter is not assigned
then the [template](03-template-mapping.md) mapped to the type of the resource
is used for processing to return a formatted text value.

```
@@ -88,23 +88,23 @@ is used for processing to return a formatted text value.
}}
```

If `$GLOBALS['scigEnabledCitationTextChangeUpdateJob']` is set true then a change to
If `$GLOBALS['scigEnabledCitationTextChangeUpdateJob']` is set true then an alteration to
a citation text will initiate an update job for those pages that make reference to the
related citation resource.
changed citation resource.

## Type assignment

A type assignment is expected for each citation resource unless `$GLOBALS['scigStrictParserValidationEnabled']`
A type assignment is required for each citation resource unless `$GLOBALS['scigEnabledStrictParserValidation']`
is set `false`.

If multiple types are assigned (e.g.`|type=bgn:Thesis;schema:Book|+sep=;`) then
the last entry (e.g. `schema:Book`) will be selected as valid type descriptor.
the last entry (e.g. `schema:Book`) will be used as type descriptor.
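
The "last entry wins" rule for multiple types can be illustrated with a minimal sketch. The function name `selectTypeDescriptor` is hypothetical and is not part of the extension; the sketch only assumes that the value is split on the separator given by `+sep` and that the final non-empty entry is kept.

```php
<?php

/**
 * Hypothetical helper (not the extension's code): returns the last
 * non-empty type from a separator-delimited value, or null if none.
 */
function selectTypeDescriptor( $value, $separator = ';' ) {

	// Split on the separator, trim whitespace and drop empty entries
	$types = array_filter(
		array_map( 'trim', explode( $separator, $value ) ),
		'strlen'
	);

	// The last remaining entry is used as the type descriptor
	return $types === array() ? null : end( $types );
}

$selected = selectTypeDescriptor( 'bgn:Thesis;schema:Book' );
```

With the example value from the text, `$selected` holds `schema:Book`.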

## Bibtex record import

To ease the reuse of bibtex records, `#scite` provides the `|bibtex=` parameter to
import a bibtex formatted text to create annotatable record following
the assignments declared in the `MediaWiki:` [property](02-property-mapping.md) and
import a bibtex formatted text to create an annotatable record that corresponds to
the mapping found in the [property](02-property-mapping.md) and
[template](03-template-mapping.md) page.

```
@@ -121,7 +121,7 @@ YEAR=2000
}}
```

### Author list
### Bibtex author list

Authors (e.g. `Einstein, Albert and Podolsky, Boris and Rosen, Nathan`) will be split
into an author list of natural representations (`Albert Einstein` etc.) while the original
@@ -142,12 +142,12 @@ annotation string is still available using the hidden `bibtex-author` parameter.
}}
```
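
The author split described in this section can be approximated by the following sketch. It is not the parsing code reused from the bibliophile project; `splitBibtexAuthors` is a hypothetical name, and the sketch assumes that authors are joined by ` and ` and written either as `Last, First` or already in natural order.

```php
<?php

/**
 * Hypothetical helper (not the extension's parser): turns a bibtex
 * author string into a list of names in natural order.
 */
function splitBibtexAuthors( $authors ) {

	$list = array();

	// Authors are assumed to be separated by the word "and"
	foreach ( preg_split( '/\s+and\s+/i', trim( $authors ) ) as $author ) {

		$parts = array_map( 'trim', explode( ',', $author, 2 ) );

		// "Last, First" becomes "First Last"; other forms are kept as-is
		$list[] = count( $parts ) === 2 ? $parts[1] . ' ' . $parts[0] : $parts[0];
	}

	return $list;
}

$authorList = splitBibtexAuthors( 'Einstein, Albert and Podolsky, Boris and Rosen, Nathan' );
```

For the example string above, `$authorList` contains `Albert Einstein`, `Boris Podolsky` and `Nathan Rosen`.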

### Content formatting
### Bibtex content formatting

`@article` is parsed as type `article` that can be assigned to a specific [template](03-template-mapping.md)
containing the rules of how text elements are to be formatted. Please be aware
that no automatic clean-up is done on elements like containing `{`/`}` or new lines as in
`in \n SUSY`. Furthermore, complex expressions (those involve macros etc.) are
that no automatic clean-up is done on elements containing `{`/`}` or new lines (e.g. `in \n SUSY`), and
complex expressions (those that involve macros etc.) are
not parsed or resolved.

```
@@ -170,5 +170,10 @@ not parsed or resolved.
}
}}
```
## Citation meta record

`$GLOBALS['scigEnabledCitationMetaRecord']` can be enabled to generate additional record data about the
references used (e.g. citation frequency) to help answer questions like "How often is a reference cited
within an article?" or "How often is a reference cited within a wiki?".
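
Since the setting defaults to `false` in `SemanticCite.php`, it has to be switched on explicitly. A minimal configuration fragment, assuming the usual placement in `LocalSettings.php` after the extension has been loaded, might look like this:

```php
// LocalSettings.php (assumed location), after the extension is loaded
$GLOBALS['scigEnabledCitationMetaRecord'] = true;
```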

[smw-ns]: https://semantic-mediawiki.org/wiki/Help:$smwgNamespacesWithSemanticLinks
124 changes: 124 additions & 0 deletions src/CitationMeta.php
@@ -0,0 +1,124 @@
<?php

namespace SCI;

use SMW\Subobject;
use SMW\DIWikiPage;
use SMW\DIProperty;
use SMWDIContainer as DIContainer;
use SMWContainerSemanticData as ContainerSemanticData;
use SMW\DataValueFactory;
use SMW\SemanticData;

/**
 * @license GNU GPL v2+
 * @since 1.1
 *
 * @author mwjames
 */
class CitationMeta {

	/**
	 * @var CitationReferencePositionJournal
	 */
	private $citationReferencePositionJournal = null;

	/**
	 * @var boolean
	 */
	private $isEnabled = false;

	/**
	 * @since 1.1
	 *
	 * @param CitationReferencePositionJournal $citationReferencePositionJournal
	 */
	public function __construct( CitationReferencePositionJournal $citationReferencePositionJournal ) {
		$this->citationReferencePositionJournal = $citationReferencePositionJournal;
	}

	/**
	 * @since 1.1
	 *
	 * @param boolean $isEnabled
	 */
	public function setEnabledState( $isEnabled ) {
		$this->isEnabled = (bool)$isEnabled;
	}

	/**
	 * @since 1.1
	 *
	 * @param SemanticData $semanticData
	 *
	 * @return boolean
	 */
	public function addMetaRecordToSemanticData( SemanticData $semanticData ) {

		if ( !$this->isEnabled ) {
			return false;
		}

		$containerSemanticData = $this->tryToCollectCitationFrequency(
			$semanticData->getSubject()
		);

		if ( $containerSemanticData === null || $containerSemanticData->isEmpty() ) {
			return false;
		}

		$semanticData->addPropertyObjectValue(
			new DIProperty( PropertyRegistry::SCI_CITE_META ),
			new DIContainer( $containerSemanticData )
		);

		return true;
	}

	private function tryToCollectCitationFrequency( DIWikiPage $subject ) {

		$journal = $this->citationReferencePositionJournal->getJournalBySubject( $subject );

		if ( $journal === array() || !isset( $journal['reference-list'] ) ) {
			return null;
		}

		$subWikiPage = new DIWikiPage(
			$subject->getDBkey(),
			$subject->getNamespace(),
			$subject->getInterwiki(),
			'sci.meta'
		);

		$containerSemanticData = new ContainerSemanticData( $subWikiPage );

		foreach ( $journal['reference-list'] as $hash => $citationKey ) {

			if ( !isset( $journal['reference-pos'][$hash] ) ) {
				continue;
			}

			$this->addFrequencyRecord(
				$containerSemanticData,
				$subject,
				$citationKey,
				count( $journal['reference-pos'][$hash] )
			);
		}

		return $containerSemanticData;
	}

	private function addFrequencyRecord( $containerSemanticData, $subject, $citationKey, $count ) {

		$dataValue = DataValueFactory::getInstance()->newPropertyObjectValue(
			new DIProperty( PropertyRegistry::SCI_CITE_FREQUENCY ),
			$citationKey . ';' . $count,
			false,
			$subject
		);

		$containerSemanticData->addDataValue( $dataValue );
	}

}
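
The counting step in `tryToCollectCitationFrequency` can be tried out without Semantic MediaWiki using the sketch below. It assumes the journal is an array with a `reference-list` map (hash to citation key) and a `reference-pos` map (hash to recorded positions), as read by the class above; the function name `collectCitationFrequency` and the sample data are made up for illustration.

```php
<?php

/**
 * Hypothetical stand-alone helper: builds "citationKey;count" strings
 * from a journal array shaped like the one read in CitationMeta.
 */
function collectCitationFrequency( array $journal ) {

	if ( !isset( $journal['reference-list'] ) ) {
		return array();
	}

	$records = array();

	foreach ( $journal['reference-list'] as $hash => $citationKey ) {

		// References without any recorded position are skipped
		if ( !isset( $journal['reference-pos'][$hash] ) ) {
			continue;
		}

		// Same "key;count" record string as in addFrequencyRecord
		$records[] = $citationKey . ';' . count( $journal['reference-pos'][$hash] );
	}

	return $records;
}

// Made-up sample journal: one reference cited twice, one never positioned
$journal = array(
	'reference-list' => array( 'a1' => 'Byrne 2008', 'b2' => 'Foo 2007' ),
	'reference-pos'  => array( 'a1' => array( '1-a', '1-b' ) )
);

$records = collectCitationFrequency( $journal );
```

Here `$records` contains the single entry `Byrne 2008;2`, because `Foo 2007` has no recorded position and is skipped.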
10 changes: 10 additions & 0 deletions src/CitationResourceMatchFinder.php
@@ -4,6 +4,8 @@

use SMW\Query\Language\SomeProperty;
use SMW\Query\Language\ValueDescription;
use SMW\Query\Language\Conjunction;
use SMW\Query\Language\ThingDescription;
use SMW\Query\PrintRequest;
use SMW\Store;
use SMW\DIProperty;
@@ -165,6 +167,14 @@ public function findMatchForCitationReference( $citationReference ) {
new ValueDescription( new DIBlob( $citationReference ) )
);

$description = new Conjunction( array( $description ) );
$description->addDescription(
new SomeProperty(
new DIProperty( PropertyRegistry::SCI_CITE_TEXT ),
new ThingDescription()
)
);

$propertyValue = new PropertyValue( '__pro' );
$propertyValue->setDataItem(
new DIProperty( PropertyRegistry::SCI_CITE_TEXT )
11 changes: 3 additions & 8 deletions src/CitationTextChangeUpdateJobDispatcher.php
@@ -73,7 +73,7 @@ public function dispatchUpdateJobFor( DIWikiPage $subject, CompositePropertyTabl
new DIProperty( PropertyRegistry::SCI_CITE_TEXT )
);

$subjectIdList = $this->getSubjectListFrom(
$subjectIdList = $this->getSubjectIdListFromOrderedTableDiff(
$compositePropertyTableDiffIterator->getOrderedDiffByTable( $tableName )
);

@@ -93,16 +93,11 @@ public function dispatchUpdateJobFor( DIWikiPage $subject, CompositePropertyTabl
return true;
}

private function getSubjectListFrom( array $orderedDiff ) {
private function getSubjectIdListFromOrderedTableDiff( array $orderedTableDiff ) {

$subjectIdList = array();

// Find out whether a cite text object was altered
foreach ( $orderedDiff as $key => $value ) {

if ( strpos( $key, 'sci_cite_text' ) === false ) {
continue;
}
foreach ( $orderedTableDiff as $key => $value ) {

if ( !isset( $value['delete'] ) ) {
$value['delete'] = array();
31 changes: 31 additions & 0 deletions src/DataValues/CitationFrequencyValue.php
@@ -0,0 +1,31 @@
<?php

namespace SCI\DataValues;

use SMWRecordValue as RecordValue;
use SMW\DIProperty;
use SCI\PropertyRegistry;

/**
 * @license GNU GPL v2+
 * @since 1.1
 *
 * @author mwjames
 */
class CitationFrequencyValue extends RecordValue {

	/**
	 * @param string $typeid
	 */
	public function __construct( $typeid = '' ) {
		parent::__construct( '_sci_rec' );
	}

	public function getPropertyDataItems() {
		return array(
			new DIProperty( PropertyRegistry::SCI_CITE_KEY ),
			new DIProperty( PropertyRegistry::SCI_CITE_COUNT )
		);
	}

}
4 changes: 3 additions & 1 deletion src/DataValues/CitationReferenceValue.php
@@ -102,6 +102,8 @@ protected function parseUserValue( $value ) {
*/
public function getShortWikiText( $linked = null ) {

$this->citationReferencePositionJournal = $this->getExtraneousFunctionFor( '\SCI\CitationReferencePositionJournal' );

// We want the last entry here to get the major/minor
// number that was internally recorded
$referencePosition = $this->citationReferencePositionJournal->findLastReferencePositionEntryFor(
@@ -110,7 +112,7 @@ public function getShortWikiText( $linked = null ) {
);

if ( $referencePosition === null || $this->m_caption === false ) {
return '';
return $this->m_dataitem->getString();
}

$referenceHash = md5( $this->reference );