Tuesday, 26 February 2013

Linking ARC and NHMRC Grants in RIF-CS

How do you link collections, parties and services to the activity records of projects publicly funded by the ARC and NHMRC?

You can link RIF-CS objects (collections, parties and services) to projects publicly funded by the ARC and NHMRC using the grant PURL (Persistent URL). Add the PURL for the grant as the key of a relatedObject element in the RIF-CS object.

Example:

   <collection type="collection" dateAccessioned="2012-05-03T00:00:00Z" dateModified="2012-07-23T04:53:03Z">
      <identifier type="uri">http://www.earthbyte.org/Resources/GSAtoday_2012.html</identifier>
....
      <relatedObject>
        <key>http://purl.org/au-research/grants/arc/SR0566892</key>
        <relation type="isOutputOf"/>
      </relatedObject>

where SR0566892 is the Project ID (ARC).

This example is accessible at
http://researchdata.ands.org.au/high-resolution-gplates-sample-data-to-accompany-an-article-in-gsa-today
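
The linking step above can be sketched in a few lines of Python. This is only an illustration: the PURL base is inferred from the single ARC example above, the helper names grant_purl and add_grant_link are hypothetical, and the RIF-CS namespace is left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Assumption: base path inferred from the one ARC example in this post.
PURL_BASE = "http://purl.org/au-research/grants"


def grant_purl(funder: str, project_id: str) -> str:
    """Build a grant PURL from a funder path segment and a Project ID."""
    return f"{PURL_BASE}/{funder}/{project_id}"


def add_grant_link(rifcs_object: ET.Element, purl: str,
                   relation: str = "isOutputOf") -> ET.Element:
    """Append a relatedObject whose key is the grant PURL."""
    related = ET.SubElement(rifcs_object, "relatedObject")
    ET.SubElement(related, "key").text = purl
    ET.SubElement(related, "relation", {"type": relation})
    return related


# Hypothetical, namespace-free collection element used only for the demo.
collection = ET.Element("collection", {"type": "collection"})
add_grant_link(collection, grant_purl("arc", "SR0566892"))
print(ET.tostring(collection, encoding="unicode"))
```

A real feed would declare the RIF-CS namespace and carry the rest of the collection metadata; only the relatedObject part is shown here.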

What if my grant is not listed in RDA?

The PURL for the awarded grants in 2011 and 2012 can be constructed as follows:

For enquiries, please email amir.aryani@ands.org.au

Monday, 25 February 2013

Metadata Stores Community News #12

1. Coming events:
2. Acceptance Criteria for Metadata Stores Deliverables - a new Tab in the Metadata Stores blog.
3. The working list of Unexpected behaviours in NLA/Trove can now be found in the Resources tab of the Metadata Stores blog.

For any questions, clarifications or additions, please contact Simon Pockley on (03) 99020549 or at simon.pockley@ands.org.au.

Thursday, 21 February 2013

TIM Matching Workshop with Natasha Simons - Wed 27th Feb 2013

Don't miss this TIM Matching Workshop with Natasha Simons on Wednesday 27th February at 10:00am (NSW, Vic, Tas) (9:00am Qld) (9:30am SA) (8:30am NT) (7:00am WA) via GoTo Meeting. Log-in details are below (places limited to 26).

A recording of this workshop is now available (05-03-2013)

Natasha has special insights into the National Library of Australia's Party record matching processes and will not only be demonstrating how to match real records in real time but will also share her experiences of the way in which Griffith University has gone about managing the matching process.

  • Setting up for harvesting of party records by the NLA
  • Signing up for access to Trove Identities Manager (TIM)
  • Hand matching records using TIM - demonstration
  • Getting NLA identifiers back, putting them in the metadata store and providing these to RDA
  • Summary of experience (successes, problems)

TIM (Trove Identities Manager) is the NLA's tool for maintaining records in the party infrastructure. It enables authorised staff from institutions and organisations that contribute party records to the party infrastructure to manage these records.

To get the most out of this workshop you should be familiar with the TIM training modules and the National Library of Australia's Guide to managing records in the party infrastructure.

See also Matching rules - Trove Party Infrastructure rules for auto-matching party records for Trove.

How to access this workshop

1. Please join my meeting.
https://www4.gotomeeting.com/join/902515079

2. Use your microphone and speakers (VoIP) - a headset is recommended.

Or, call in using your telephone.

Dial +61 2 8355 1031
Access Code: 902-515-079
Audio PIN: Shown after joining the meeting

Meeting ID: 902-515-079

Friday, 15 February 2013

Approaching Wednesday morning events for your calendar

  1. Wednesday 20th February 2013: No Data Clinic due to Adelaide ReDBox Community day and a half
  2. Wednesday 27th February 2013: TIM Matching Workshop with Natasha Simons (details to follow)
  3. Wednesday 6th March 2013: ReDBox Intensive with Duncan Dickinson (details to follow)

Thursday, 14 February 2013

ReDBox Community Day and a Half - when you arrive

For those who have registered for the ReDBox Community Day and a Half in Adelaide, 19th - 20th Feb 2013.

For your convenience, here is a campus map.

If you are bringing a car:

Parking instructions:

1. Closest carpark is Car Park 6: short term car park, needs change for meter, $1.20 p.h. or $9.60 per day

2. Next closest carpark is Car Park 1: all day park, meter ticket costs $4.20

3. Both have Permit bays available.

You might also like to use these directions written by a human with a sense of humour:

1. Alight from your helicopter/cab/limousine/personnel carrier in Registry Road, by the bus zone.

2. Turn to face the flagpoles, and walk (NW) towards the large silver and black building (mind the buses!).

3. Take the short staircase down to the footbridge. (If you are afraid of footbridges, internal staircases, or lifts, continue to follow the external staircases down and around to the left behind the building.)

4. Cross the footbridge and continue along through some double glass doors, to the lifts/stairs.

5. Take the lifts/stairs down to Level 1, where you will find the ReDBox Registration Desk, and all your friends.

6. Rooms 1.27 and 1.28 are at the end of a short corridor running (East) off the lobby (past the bathrooms).

There will be signage – but if you get really lost, Amanda is happy for you to contact her on 0450 101 344.

Tea/coffee will be available in the mornings to give everyone a kick start, and morning/afternoon tea and lunch are provided by our sponsors – ANDS.

Please feel free to contact Amanda Nixon with any questions. She is looking forward to seeing you on Tuesday.

(source - edited email from Amanda)

Monday, 11 February 2013

NLA/Trove - unexpected behaviours: Wed 13 Feb 10:00am (Melb Time)

Recent experiences with the NLA/Trove test environment have resulted in a working list of unexpected behaviours. Most need some form of explanation.

We'll be talking through this list on Wednesday 13th February 2013 at 10:00am (NSW, Vic, Tas) (9:00am Qld) (9:30am SA) (8:30am NT) (7:00am WA) via GoTo Meeting. Log-in details are below (places limited to 26).

N.B. This is a working list - please contact Simon Pockley about any amendments or additions you would like to make.

  1. Testing environment

    TIM Beta is the system that NLA use to check the accuracy of records before they are loaded into production. It isn’t a development system. There are limitations to the testing environment.

    a. The test NLA records cannot be deleted, so tests cannot be rerun and new people need to be created and used each time.
    b. NLA ID in TIM Beta resolves to Trove Production rather than Trove Test. This can prompt concern or confusion that valid NLA IDs might be incorrectly ascribed to real researchers and then published on the web.

    Status - no action. The testing environment in TIM Beta does not impact on the production environment.

  2. Access to TIM

    a. Access to TIM is on an individual basis without a generic link to the institution.
    Status - work-around. Establish a convention for managed usernames and passwords, for example, “ANDS.tim.matching01”, “ANDS.tim.matching02”, etc. for multiple users. A single sign-on could be established, remembering that multiple users cannot log in to TIM with the same username and password simultaneously. Generic usernames and email addresses will allow usernames to be reassigned and the password reset.
    b. Also having to create individual IDs for Trove and TIM (and making these the same for the test instances of each) is cumbersome compared to AAF access to Research Data Australia.

    Status – No action. Signup to Trove is open to any user of Trove, not just those with access to the Australian Access Federation (AAF) credentials. One username and password is used for both Trove and TIM (production and test).

  3. Transparency of record status

    There’s no indication whether records have been auto-matched or gone into TIM for manual matching.

    Status - work-around. Checking which records are in TIM waiting for matching is a manual process.

  4. Identifiers lose their type

    Identifiers lose the type indicated in the contributor party record, e.g. "scopus". Any identifier, and the key, supplied in a contributor party record to Trove becomes type="contributor’s ISIL code" in the Trove party record. This applies even to NLA-supplied party identifiers.
    Status – low impact - no action

  5. Single element names not harvested

    RIF-CS schema guidelines state that <namePartType> is optional. Names are presented in different ways because GeoNetwork uses ISO 19115 and/or ANZLIC as its native schema, which allows free text entry of names in a single element. RIF-CS party records of type person that originate in either ANZLIC or ISO 19115 will not be harvested by Trove because they lack <namePartType>. Note that this issue affects party person records only. Group records without <namePartType> will be ingested by Trove.
    Status – low impact - no action
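
Contributors who want to spot affected records before a harvest could run a check along these lines. It is a rough sketch, not an NLA or ANDS tool. It assumes that <namePartType> refers to the type attribute on the RIF-CS namePart element, and the function name and sample keys are made up for illustration.

```python
import xml.etree.ElementTree as ET

RIFCS_NS = {"rif": "http://ands.org.au/standards/rif-cs/registryObjects"}


def untyped_person_names(rifcs_xml: str) -> list[str]:
    """Return keys of person party records with at least one untyped namePart."""
    root = ET.fromstring(rifcs_xml)
    flagged = []
    for obj in root.iter("{%s}registryObject" % RIFCS_NS["rif"]):
        party = obj.find("rif:party", RIFCS_NS)
        # Only person records are affected; group records are still ingested.
        if party is None or party.get("type") != "person":
            continue
        parts = party.findall("rif:name/rif:namePart", RIFCS_NS)
        if any(not part.get("type") for part in parts):
            flagged.append(obj.findtext("rif:key", default="", namespaces=RIFCS_NS))
    return flagged


# Hypothetical sample feed: one untyped person, one typed person, one group.
SAMPLE = """<registryObjects xmlns="http://ands.org.au/standards/rif-cs/registryObjects">
  <registryObject group="Example University">
    <key>example/party/1</key>
    <party type="person">
      <name type="primary"><namePart>Jane Smith</namePart></name>
    </party>
  </registryObject>
  <registryObject group="Example University">
    <key>example/party/2</key>
    <party type="person">
      <name type="primary">
        <namePart type="family">Smith</namePart>
        <namePart type="given">Jane</namePart>
      </name>
    </party>
  </registryObject>
  <registryObject group="Example University">
    <key>example/party/3</key>
    <party type="group">
      <name type="primary"><namePart>Example Lab</namePart></name>
    </party>
  </registryObject>
</registryObjects>"""

print(untyped_person_names(SAMPLE))
```

In the sample, only the first record is flagged: the second has typed name parts and the third is a group record.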

  6. NLA converts keys to RDA URLs

    Not all institutions maintain or even create public profile pages. These are used as Source URLs in a Trove record.

    Status - work-around. Where there are no source URLs for staff profiles, use an optional function in Trove. If the option is set to ‘on’, the entityID (Source URL) for a university's records in Trove is transformed to an ANDS URL. Trove uses a predefined URL structure to convert keys to a link to RDA and not to the researcher’s web page on the university web site. If set to ‘off’, Trove will use the source URL provided.

    For ReDBox/Mint users:

    Take care that the source URL does not resolve to Mint, which requires a logon, rather than to a local party record as might be expected. Apply the work-around indicated above, as required.

  7. Vanishing records

    As reported by UQ: A feed of party records was harvested but some of the records appeared in neither Trove nor TIM - they vanished without a trace. The harvest operation did not fail, since other party records from the same harvest of the feed appeared in TIM or were automatically matched and appeared in Trove.
    Status - behaviour yet to be replicated

  8. Broken links when Researchers leave an institution (tombstone records)

    There is a need for policy development regarding researchers who move or even die. The NLA has no control over contributors’ source links. Some universities use the link to the researcher's local profile as the source URL. However, these links may not be maintained after the researcher leaves the university, so in the long run there will be a lot of broken links in Trove.

    NLA comment: There is no need for a University to delete the record for a researcher when they move away from that institution. The NLA party identity is a container record and so can contain records from different universities for the same researcher. This is where some descriptive information would be valuable in the record e.g. “Left University of ZX in 2011”.

    Status – low impact - no action

  9. Inconsistent matching algorithms

    In testing, Trove's automatic matching algorithms are behaving unexpectedly:

    Issue first raised by Hoylen Sue at UQ - as follows

    Situation #1: party records are harvested for the first time. That is, Trove does not contain any party records with these RIF-CS keys.

    Situation #2: these unchanged party records are re-harvested. That is, Trove already has existing party records with these same RIF-CS keys. In this situation UQ has found the two party records for Watson and Drebber pass "Matching Rules Part A" and replace the existing party records -- which is the correct behaviour. But the party record for Holmes does not match and goes into TIM Beta -- which is not correct. It should have been automatically matched just like the other two. Neither we nor the NLA have figured out why Holmes is treated differently.

    NLA comment:

    Paul has checked this problem as much as he can and when he sends the record to TIM manually it matches. Essentially the way he's doing it is exactly the way the harvester would send the records to TIM so we have no idea why the records aren't matching when we harvest them. Paul says we need more data to work with and he's reluctant to use your data as then you won't be able to use it. Perhaps if we had lots more test records to work with we might be able to see a pattern. Sorry I don't have a solution or a reason for the records not matching.

    Status - behaviour yet to be replicated

  10. Complex matching rules

    There is an assumption that if local party records contain the NLA identifier as an identifier, then matching will occur automatically, but this is not necessarily the case. The matching rules are complex and deliberately conservative. Records which fail Part A of the matching rules then go to Part B where the NLA identifier is just one of the match points.

    Rules for matching party records: https://wiki.nla.gov.au/download/attachments/24379936/ARDCPIPMatchingRulesSpec-+Ver+2.0++20+Jan+2012.doc?version=1&modificationDate=1333064788000

    From the matching rules:

    The purpose of Part A: to decide if two records are actually the same record
    An incoming record is checked to determine if it has the same contributor ISIL and record ID as a record in Trove.

    If it does, a sanity check is applied. If the incoming record passes the sanity check, it is automatically matched and overlays (replaces) the matching record in Trove. If the incoming record fails the matching rules in Part A then matching rules in Part B are applied.

    The purpose of Part B: to decide if a record should be part of an existing identity.

    The rules specified in Part B are applied to an incoming record that has failed the matching rules specified in Part A.

    An incoming record is checked to determine if it has an NLA persistent identifier. If it does, a sanity check is applied and must be passed for a match to occur.

    Incoming records that fail the matching rules specified in Part A and B are passed by the identity service to the unmatched record queue for human review, for matching or new record creation in TIM.

    Example:

    A university creates a record for “Davies, Peter” containing the NLA party identifier for the existing Trove record for “Davies, Peter Eric” (the Trove record was contributed by Libraries Australia, ISIL code AU-ANL:PEAU).

    Result: Part A matching rules apply, but neither the ISIL code nor the record ID matches, so Part B rules are applied.

    In the Part B matching rules the system looks in the incoming record for the NLA’s ISIL (AU-ANL:PEAU) and the NLA party identifier, recorded in control/sources/source/objectXMLWrap.

    If the university added the NLA party identifier for “Davies, Peter Eric” and the NLA agency code to the incoming record, the system would run the sanity check. It would match on Davies, on Peter, but it wouldn’t find Eric in the incoming record and so the name would fail the matching rules. Because there is always the risk that a mistake could be made when the NLA party identifier is manually entered into another organisation’s record, the system has to be certain it’s the same person so it will always check all name parts and all name parts have to match.

    Result: In this example, if the names didn’t automatically match then the record failed the sanity checks at some point.

    Status – investigate. The example above explains why records containing an NLA party identifier will not necessarily match even though they have the same NLA identifier.
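
The Part A / Part B flow described above can be sketched roughly as follows. This is a simplified illustration and not the NLA's implementation. The sanity check is reduced to "all name parts must match", Part B considers only the NLA identifier match point, and the class, function names and identifier values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PartyRecord:
    isil: str                      # contributor ISIL code
    record_id: str                 # contributor's record ID
    name_parts: frozenset          # e.g. frozenset({"Davies", "Peter"})
    nla_id: Optional[str] = None   # NLA party identifier, if supplied


def sanity_check(incoming: PartyRecord, existing: PartyRecord) -> bool:
    # Simplification: every name part has to match, in both directions.
    return incoming.name_parts == existing.name_parts


def match(incoming: PartyRecord, trove: list) -> str:
    # Part A: is this the same record (same contributor ISIL and record ID)?
    for existing in trove:
        if (existing.isil, existing.record_id) == (incoming.isil, incoming.record_id):
            if sanity_check(incoming, existing):
                return "part A: overlay existing record"
            break  # Part A failed, so fall through to Part B
    # Part B: does the record belong to an existing identity?
    if incoming.nla_id is not None:
        for existing in trove:
            if existing.nla_id == incoming.nla_id and sanity_check(incoming, existing):
                return "part B: add to existing identity"
    return "unmatched: queue for human review in TIM"


# The "Davies, Peter" example, with made-up record IDs and NLA identifier.
trove = [PartyRecord("AU-ANL:PEAU", "la-1",
                     frozenset({"Davies", "Peter", "Eric"}), "nla.party-0000")]
incoming = PartyRecord("AU-EXAMPLE", "uni-42",
                       frozenset({"Davies", "Peter"}), "nla.party-0000")
print(match(incoming, trove))
```

Because "Eric" is missing from the incoming record, the sketch sends it to the unmatched queue even though the NLA identifier is the same, which mirrors the behaviour described in the example.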

  11. RIF-CS 1.4 mapping to EAC-CPF (Encoded archival context for corporate bodies, persons and families)

    There is currently no separate overarching application profile document for NLA use of EAC-CPF. The NLA maintains mappings from MARC and RIF but these are both in flux at present as the NLA makes changes to support RDA and RIF-CS 1.4. The mapping for RIF-CS 1.4 is finished but the NLA’s IT team has yet to finish the style sheet that converts the records.

    Status – in progress

  12. Name sequence or name order

    The order in which names are entered is important because the Trove System has its own way of processing first name and last name. Best to enter in natural order i.e. first name: Jane, last name: Smith

    Status – work-around
---------------------------------------------------
1. Please join my meeting.
https://www4.gotomeeting.com/join/902515079

2. Use your microphone and speakers (VoIP) - a headset is recommended.

Or, call in using your telephone.

Dial +61 2 8355 1031
Access Code: 902-515-079
Audio PIN: Shown after joining the meeting

Meeting ID: 902-515-079

Metadata Stores Community News #11

Lots on...

1. Many thanks to Natasha Simons for a great presentation about the challenges arising from building the Research Hub at Griffith University. We have an audio file of the Presentation and Discussion that will be made available to you as soon as I have processed it and Natasha has approved its release.

2. NLA/Trove - unexpected behaviours. Recent experiences with the NLA/Trove test environment have resulted in a working list of unexpected behaviours. Most need some form of explanation. We'll be talking through this list on Wednesday 13th February 2013 at 10:00am (NSW, Vic, Tas) (9:00am Qld) (9:30am SA) (8:30am NT) (7:00am WA) via GoTo Meeting https://www4.gotomeeting.com/join/902515079 (more details to follow).

3. 18th February 2013 - Research Profiles Conference 2013
Venue: Melbourne University. Sponsored by Symplectic. Registrations. This one-day conference will explore how Australasian universities as a whole can leverage this capability to gain advantage on an international stage.

Topics covered will include:
  1. The international state of play in research profiling
  2. An assessment of the technologies and standards that assist in the syndication of research information, with a particular emphasis on VIVO as an enabling platform
  3. Local experiences of implementing research profiling systems
  4. A road map for open research data and the university
  5. The future of research reporting and assessment in an era of open data
4. 19th-20th February 2013 ReDBox Community Day and a half. The link to the Agenda has now been updated to include the second day (20th Feb).