Feedback Request: Knowledge Base and Metadata Planning


#1

Hi everyone.

Over the last few weeks I’ve been trying to describe the scope and direction of the ongoing Knowledge Base and Bibliographic Metadata work, and have put my thoughts into the following draft document: https://wiki.folio.org/display/~marcjohnson/Knowledge+Base+and+Metadata+-+Scope+and+Domain

I believe it has reached a stage, whilst still very much a work in progress, where feedback from a wider audience would be most appreciated, so I would like to ask the members of this community for their thoughts.

Hugs and thanks,

Marc


#2

My FOLIO login credentials do not appear to be giving me access to this document. Perhaps user error on my part?


#3

Hi Mike,
I can access and read Marc’s document, but you need to be signed in to Discuss.
Best
Charlotte


#4

Hi Marc,

Thanks for sharing this – I’ve been curious to see how the data modeling is coming along. On the whole, I think this is a really good start. The domain model you shared has a lot in common with GOKb and other projects that have been trying to evolve the representation of this space. A few questions/comments:

Use of the term “bibliographic metadata” seems a little off to me in some places in the document. Based on the domain model diagram and other discussion, it sounds like there is a large role for what I think of as “management metadata.” This includes entitlements, subscription models, costs, usage statistics, loans, etc. It’s metadata that describes a library’s interaction with a resource, rather than the content of the resource itself. I’m a little unsure whether this document intends to focus on both bibliographic and management metadata or if the intention is to take just the bibliographic metadata as a starting point.

How exactly does the Instance work? Would a bib record (in any format) be linked to an instance or would it be a type of instance? Is there a concept of a “title” or “work” – e.g., the conceptual work Hamlet of which various instances exist?

What is the Electronic Access record in the domain model meant to represent? Can you explain a little bit more about how this record will relate to a package, subscription, or entitlement?

Best,
Kristen


#5

It would be helpful to have a definition of Entitlements in the appendix. Is that user-related permissions (e.g. faculty, grad students, students), or resource-related permissions (e.g. single concurrent user, unlimited usage, downloadable, how much can be printed), both, or something else entirely?


#6

Hi Kristen,

Thank you for the thoughts, I very much appreciate the feedback. I’ve tried to answer your questions inline below (with your original comments in italic), I hope some of these thoughts are useful. I have likely only scratched the surface of many of these topics, so please feel free to follow up with more questions or thoughts, especially if something is confusing.

Hugs,

Marc

A few questions/comments:

Use of the term “bibliographic metadata” seems a little off to me in some places in the document.

I’ve gone back and forth around this, the intention was to emphasise that the term knowledge base, when used in this document, is not a generic place for all forms of metadata. I struggled to think of good terminology to describe the subset of metadata we are covering (more on this below).

I would very much like to distinguish between the different contexts of metadata and which concepts relate to each of them (alternative versions of the domain diagram had contexts overlaid upon it).

Based on the domain model diagram and other discussion, it sounds like there is a large role for what I think of as “management metadata.” This includes entitlements, subscription models, costs, usage statistics, loans, etc. It’s metadata that describes a library’s interaction with a resource, rather than the content of the resource itself. I’m a little unsure whether this document intends to focus on both bibliographic and management metadata or if the intention is to take just the bibliographic metadata as a starting point.

The scope of what we are referring to as the Knowledge Base and Metadata work is one of the conversations I hope this document starts.

At the moment, I think it attempts to partially cover aspects of both bibliographic and management metadata, whilst ignoring many aspects of both (e.g. subscription only models a subscription to an electronic package, not to a print journal at the moment).

It might be that this mixed form is a little clunky and confused, it might be that separate documents emerge for each category of metadata. In this document, it was intended to provide context to the domain which the Knowledge Base and Metadata development over the next quarter will likely participate in.

Some of the concepts are intended to represent relationships with other module development, e.g. loan will likely belong in a circulation module and refer to an item.

Two of the challenges facing the development of the Knowledge Base and Metadata suite of modules are that it is happening concurrently with the UX/Analysis processes for many of the contexts which will use it (e.g. the RM SIG Boston face to face meeting) and that it will likely form a linchpin for connecting those contexts.

I think there is a need to group the concepts we are modelling in FOLIO by named contexts. The list below could be a starter for 10 (and covers a broader domain than the document). It is very likely that the granularity of these contexts is incorrect at this point.

Bibliographic - Instance, Subject, Identifier, Material, Person
Physical Inventory - Physical Item
Electronic Access - Electronic Access / Electronic Entitlement, Usage, Access
Circulation - Loan, Fine, Reservation
Subscription - Package, Subscription, Platform
Acquisitions - Order, Invoice
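
As a rough illustration only, the grouping above could be written down as a simple lookup table. This is a minimal Python sketch, not anything from the document or from FOLIO itself: the `CONTEXTS` name and the `context_of` helper are hypothetical, and the entries are copied straight from the list above.

```python
# Hypothetical lookup table: context name -> concepts, transcribed from the list above.
CONTEXTS = {
    "Bibliographic": ["Instance", "Subject", "Identifier", "Material", "Person"],
    "Physical Inventory": ["Physical Item"],
    "Electronic Access": ["Electronic Access / Electronic Entitlement", "Usage", "Access"],
    "Circulation": ["Loan", "Fine", "Reservation"],
    "Subscription": ["Package", "Subscription", "Platform"],
    "Acquisitions": ["Order", "Invoice"],
}


def context_of(concept):
    """Return the names of every context that lists the given concept."""
    return [name for name, concepts in CONTEXTS.items() if concept in concepts]


print(context_of("Loan"))
print(context_of("Work"))  # not grouped yet, so no context is returned
```

A table like this makes it easy to spot concepts that have no home yet, or that end up in more than one context once the granularity is revisited.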

How exactly does the Instance work? Would a bib record (in any format) be linked to an instance or would it be a type of instance? Is there a concept of a “title” or “work” – e.g., the conceptual work Hamlet of which various instances exist?

An instance can be thought of as very similar to title instance in GOKb and is closely related to the concept in Bibframe. At the moment, that is mostly a consolidation of bibliographic metadata across multiple items/entitlements.

My understanding is that many existing record formats tend to have both bibliographic and management / holdings metadata in them, so the relationship to existing records is likely to be to an instance and to other concepts (e.g. physical item).

Ian and I have talked about whether to model a work. At present I don’t feel I understand it sufficiently to describe it well, and it doesn’t seem necessary for the current scope of upcoming work. I envisage we may introduce it at a later point, particularly when we get deeper into the bibliographic metadata aspects of FOLIO.

What is the Electronic Access record in the domain model meant to represent? Can you explain a little bit more about how this record will relate to a package, subscription, or entitlement?

The three concepts of entitlement, electronic access and physical item are intended to be the beginnings of what a unification between electronic and physical resource management could look like.

The name was an intentional, if possibly naive, attempt to avoid the word entitlement for electronic resources, as for physical resources. It could well be that this would be better referred to as an electronic entitlement and in that regard would be very similar to the entitlement concept in KB+, representing the access to an electronic resource that a subscription entitles an organisation to.
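
To make the idea of unification a little more concrete, here is a minimal Python sketch under loud assumptions. None of the class or field names (`Subscription`, `ElectronicAccess`, `PhysicalItem`, `entitled_resources`) come from the document or from FOLIO; they are hypothetical and only show how electronic and physical holdings might both be read as forms of entitlement to a resource.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    package: str  # hypothetical: name of the subscribed package


@dataclass
class ElectronicAccess:
    resource: str
    subscription: Subscription  # access that a subscription entitles the organisation to


@dataclass
class PhysicalItem:
    resource: str
    barcode: str  # hypothetical identifier for the physical copy


def entitled_resources(holdings):
    """Return the distinct resources covered by any holding, electronic or physical."""
    return sorted({holding.resource for holding in holdings})


holdings = [
    ElectronicAccess("Hamlet", Subscription("Example Drama Package")),
    PhysicalItem("Hamlet", "0001"),
    PhysicalItem("Macbeth", "0002"),
]
print(entitled_resources(holdings))
```

The only thing the two kinds of holding share here is the resource they refer to, which is roughly the commonality the three concepts are trying to capture.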


#7

Hi Ann-Marie,

Thank you for responding and asking the question, I’ll endeavour to answer below.

It would be helpful to have a definition of Entitlements in the appendix. Is that user-related permissions (e.g. faculty, grad students, students), or resource-related permissions (e.g. single concurrent user, unlimited usage, downloadable, how much can be printed), both, or something else entirely?

Filling in the glossary is an ongoing endeavour, I’ve been trying to find existing sources for definitions where I can.

In the context of the document, an entitlement is very much resource focused, effectively representing the access to a resource that an organisation is entitled to.

In the case of electronic resources, it is my understanding that this tends to be via subscriptions to packages and for physical resources, the item itself is an embodiment of that entitlement. This is a little clunky, which maybe demonstrates our progress in experimenting with the idea of a unified model for electronic and physical resources.

I hope this helps a little, this document will be updated as these ideas are refined. Please do feel free to ask follow up questions, especially if this is confusing.

Hugs,

Marc


#8

Hello Marc,

I’ve been reviewing your metadata model and the comments and wonder if what we need is to create a linked model that will cover both the bibliographic metadata and the management metadata in a single group. Perhaps a combination of RM specialists and Metadata specialists could help inform this work. I’m concerned that if we work on this model independently of, or before considering, the descriptive metadata domain, we might not end up with an optimized model, or different functions of FOLIO could be siloed without that being our intention.

Kristen mentioned the idea of work, and I feel that something should be here as well. Are you considering BIBFRAME? I wonder if BF 2.0’s model (https://www.loc.gov/bibframe/docs/bibframe2-model.html) could be incorporated into the model. It would provide that “work”-level object, and also bring in ways to associate description and access to the particular object in question. You already have an instance and item in the model, so I feel like this could be quite positive, and if we could pull it off, revolutionary. We do have some experts on linked data and BF as part of the project, and this could be a wonderful way to build a model from the ground up on linked data principles that would also support the management needs of a library collection.

You mention AACR in a metadata standard of “beyond AACR.” But at this point most libraries have implemented RDA. Do you really mean “beyond RDA”?

For outcomes, will the standard support original descriptive work and setting of access points, or is that something that you see happening outside of FOLIO with that data being brought in through a similar mechanism to copy cataloging? This could be seen as similar to catalogers doing original work in OCLC and then importing the finished record. This could be limiting, however, for non-OCLC (or other utility) members.

How do you want to handle representations for non-book/journals? Do you see managing more complex objects besides individual monographs, such as archival collections?

It might be possible to consider the commonalities of e-books and e-journals, and then add in the elements that are more complex or unique to each (e.g., coverage data for e-journals). E-journals will represent the more complicated use case (but you already know that!). For management purposes, libraries will also want to track their perpetual access entitlements versus their access entitlements, which many times are different.

Looking forward to discussing this further!


#9

Hi @Mike_Sollars – there was a permissions problem in the wiki space where that document was located. It required that users be signed in with a wiki account. (You can sign up for an account on Wiki and Issues at the Issues account creation page.) That problem has been fixed; let me know if you can’t see the document at this point.


#10

What are the thoughts around parts? For example, the parts of a kit or a disc accompanying a book. Particularly when parts go missing, and a) as a librarian I want to not loan the missing part and to not expect the missing part to be returned and b) as a patron I want to discover kits that have all the parts I need.


#12

Hi Jim,

Thank you for your question. I’ve done my best to begin to answer it, please do feel free to follow up with further thoughts or questions.

What are the thoughts around parts? For example, the parts of a kit or a disc accompanying a book. Particularly when parts go missing, and a) as a librarian I want to not loan the missing part and to not expect the missing part to be returned and b) as a patron I want to discover kits that have all the parts I need.

I haven’t really considered parts of resources yet, so I’m not sure I would know where to start.

I imagine that the expected parts might be modelled as part of an instance and the condition / status of each might be a property of an item.
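
A minimal Python sketch of that idea, purely as an assumption about how it might look: the expected parts live on the instance and the status of each part lives on the item. The class names, field names and the "present" / "missing" status values are all hypothetical and not taken from the document.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Instance:
    title: str
    expected_parts: list[str]  # parts every copy is expected to have


@dataclass
class Item:
    instance: Instance
    # Status per part name; parts not listed here are assumed to be "present".
    part_status: dict[str, str] = field(default_factory=dict)

    def loanable_parts(self) -> list[str]:
        """Parts that can be loaned and should be expected back on return."""
        return [
            part
            for part in self.instance.expected_parts
            if self.part_status.get(part, "present") == "present"
        ]

    def is_complete(self) -> bool:
        """True when every expected part is present, for patrons seeking full kits."""
        return self.loanable_parts() == self.instance.expected_parts


kit = Instance("Example kit", ["manual", "controller", "disc"])
copy_one = Item(kit, {"disc": "missing"})
copy_two = Item(kit)
print(copy_one.loanable_parts(), copy_one.is_complete())
print(copy_two.loanable_parts(), copy_two.is_complete())
```

Both use cases in the question map onto this split: the librarian loans only `loanable_parts()`, and discovery can filter on `is_complete()`.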

Are the examples you provide important use cases for your library?


#13

Hi Kristin,

Thank you for the thoughts, I very much appreciate the feedback. I’ve tried to answer your questions below. I hope some of these thoughts are useful, please feel free to follow up with more questions or thoughts, especially if something is confusing.

I’ve been reviewing your metadata model and the comments and wonder if what we need is to create a linked model that will cover both the bibliographic metadata and the management metadata in a single group. Perhaps a combination of RM specialists and Metadata specialists could help inform this work. I’m concerned that if we work on this model independently of, or before considering, the descriptive metadata domain, we might not end up with an optimized model, or different functions of FOLIO could be siloed without that being our intention.

I agree, both resource management and bibliographic metadata specialists need to inform this work, I would very much like to see participants from both communities contribute to a shared or linked model that is supportive of the needs of both groups.

As I think came up when I was replying to Kristen (@Kristen_Wilson) one of the challenges with the development of the knowledge base and metadata capabilities of FOLIO is that it is happening concurrently with the UX/Analysis process for many of the domains that will need to use it. I’m open to any ways we can improve feedback into these models, whilst this is ongoing.

Kristen mentioned the idea of work, and I feel that something should be here as well. Are you considering BIBFRAME? I wonder if BF 2.0’s model (https://www.loc.gov/bibframe/docs/bibframe2-model.html) could be incorporated into the model. It would provide that “work”-level object, and also bring in ways to associate description and access to the particular object in question. You already have an instance and item in the model, so I feel like this could be quite positive, and if we could pull it off, revolutionary. We do have some experts on linked data and BF as part of the project, and this could be a wonderful way to build a model from the ground up on linked data principles that would also support the management needs of a library collection.

The model in my head (rather than the limited one in the document) has been heavily influenced by what I have managed to read about BIBFRAME, though I still have a lot to learn about it and it would seem that it doesn’t come across strongly in the diagram in the document.

I imagine the concept of a work will appear in this model over time, I am hesitant to introduce it early, as it is easier to introduce something new than replace an existing concept in the system.

What needs could we help fulfil by modelling a work at this point?

Other than work, what concepts from BIBFRAME do you think would be useful to include sooner rather than later?

I would very much like the input and involvement from experts in Linked Data and BIBFRAME. I believe this comes back to your previous point, that this model spans multiple interest groups and contexts.

You mention AACR in a metadata standard of “beyond AACR.” But at this point most libraries have implemented RDA. Do you really mean “beyond RDA”?

This was suggested by an early reviewer, in an attempt to emphasise the need not to couple ourselves to a limited set of metadata representation, there may well be a better way of saying that without referring to a specific standard.

I haven’t seen much about RDA implementations in a library context, are there any examples you can point me towards?

For outcomes, will the standard support original descriptive work and setting of access points, or is that something that you see happening outside of FOLIO with that data being brought in through a similar mechanism to copy cataloging? This could be seen as similar to catalogers doing original work in OCLC and then importing the finished record. This could be limiting, however, for non-OCLC (or other utility) members.

If original descriptive work refers to the creation of new bibliographic metadata, then I believe this model will support that. It is, along with localised customisation, the primary driver behind the idea of distinguishing between an internal and external instance. This feeds into an aspect of different cataloging approaches which we have been referring to as copy and reference cataloging.

What are access points in this context?

How do you want to handle representations for non-book/journals? Do you see managing more complex objects besides individual monographs, such as archival collections?

At the moment the model is deliberately minimal and only really represents the beginnings of support for physical and electronic monographs, I think I need to improve the clarity of this in the document. This minimalism is motivated by the desire to elicit feedback from the community to drive the creation / evolution of the model and to only model what we need for our current requirements whilst being careful not to back ourselves into a corner.

I envisage the model evolving to support serials fairly soon and other resource types over time (I don’t feel confident I could model them at the moment), this might include archival collections. Much of this evolution will be driven by the broader analysis/UX process, the priorities of the Bibliographic Metadata and Resource Management SIGs and the development of other modules.

It might be possible to consider the commonalities of e-books and e-journals, and then add in the elements that are more complex or unique to each (e.g., coverage data for e-journals). E-journals will represent the more complicated use case (but you already know that!). For management purposes, libraries will also want to track their perpetual access entitlements versus their access entitlements, which many times are different.

The current model is a first attempt to express potential commonalities between e-books and physical books, in part because the roadmap driving this means we intend to soon connect to an external knowledge base (which will most likely have information about electronic resources) and the desire to also focus on being able to migrate (part of) an existing inventory (which involves physical resources). I suspect both physical and electronic journals will follow after that and that will drive the modelling of those common aspects.

I have had a variety of conversations about how we might model aspects of a resource which vary by type and material (and probably other characteristics), I’m not yet confident to be able to express a generalised model that supports that.

Licensing (which, to me, perpetual access falls under) is a context that we haven’t really started to model yet (though I have had some thoughts following the conversations during the RM group meetings). I don’t know if that will fall under the core knowledge base / metadata development efforts or not, though at the very least it will need to link to resources within it.

I believe there is a balance to be struck here (and I am still learning what that is) between developing a system that can be used sooner rather than later in a limited fashion and creating a model which allows for continual evolution with as little disruption to other FOLIO modules as possible in the future.


#14

Hi Marc, I want to make sure members of the nascent Metadata Management SIG see this. I, too, don’t know the balance between making a system that can be used right away with a simpler model versus doing more complex modelling work upfront. So it may be possible that we can segment the work and leave the “Work” à la BIBFRAME for later, but I don’t know. I’m trying to ensure that, as we model Instance, it is harmonized with what Instance means to the metadata community’s work on BIBFRAME, and thus I start thinking about Work. But I am not a BIBFRAME expert. I’ll try to answer your specific questions here.

We are using RDA here at Chicago and I would assume most libraries at this point that are part of the project are as well. But we are still using MARC, so the changes between RDA and AACR2 seem subtle in MARC.

I’m thinking of the controlled headings in a catalog record that link a particular work to its author, subject headings, or other authority records. A means to “access” the record.


#15

Hi Kristin,

Thank you for the continued feedback.

I want to make sure members of the nascent Metadata Management SIG see this.

Yes, I would too, I’m not really aware of where the creation of that SIG is. I would be grateful for any assistance in helping me connect with that group and/or ensuring they have an opportunity to see this document.

I, too, don’t know the balance between making a system that can be used right away with a simpler model versus doing more complex modelling work upfront. So it may be possible that we can segment the work and leave the “Work” à la BIBFRAME for later, but I don’t know. I’m trying to ensure that, as we model Instance, it is harmonized with what Instance means to the metadata community’s work on BIBFRAME, and thus I start thinking about Work.

Given you and @Kristen_Wilson have both mentioned work as important for framing the concept of an instance (which is intended to be very similar to that of BIBFRAME) then I will happily add it into the model (will update the diagram and terminology accordingly).

Also, if we can confidently model the complex aspects of this domain at this point, I am more than happy to see those models and/or participate in any conversations around them.

I’m keen for us to identify the aspects where we believe there is specific complexity, particularly if we are trying something new or different. There is a tradeoff in planning development effort between getting to a sufficiently working, yet limited, system and in trialling these new or different ideas.

We are using RDA here at Chicago and I would assume most libraries at this point that are part of the project are as well. But we are still using MARC, so the changes between RDA and AACR2 seem subtle in MARC.

I’ve updated the document to include RDA alongside AACR2, I’d appreciate your thoughts on whether this statement still makes sense. Are there other popular bibliographic cataloguing standards? Does FOLIO need an opinion on them?

I’m thinking of the controlled headings in a catalog record that link a particular work to its author, subject headings, or other authority records. A means to “access” the record.

Thank you, I was finding this term confusing, as when I think of access it is in the context of being able to access electronic resources hosted outside of the organisation.

My reflection on that is that access in a bibliographic metadata context is the ability to refer to an accessible authoritative (usually external) representation of a person, subject, etc. Is that consistent with your description?


#16

Hi Marc,

I shared this post with @DoreenH, who is the convener of the Metadata Management SIG. That is just getting underway now. Hopefully we’ll get some good feedback!

I think mention of RDA/AACR2 is probably sufficient for now.

Yes, I think from a linked data perspective, one way to think of the access points is that they would be the object of the subject–predicate–object triple and would have their own statements describing them. In a more traditional MARC format, they are the controlled headings, where catalogers have provided specific additional information beyond what is provided in the piece itself to make the record easier to find, using authorities.
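
As a small illustration of the triple view, here is a Python sketch using plain tuples rather than any real linked data library. Every identifier and predicate name in it (`instance:hamlet-1`, `creator`, `preferredLabel`, and so on) is made up for the example and is not taken from BIBFRAME or from the document.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# All identifiers and predicates below are invented for illustration.
triples = [
    Triple("instance:hamlet-1", "creator", "authority:shakespeare"),
    Triple("instance:hamlet-1", "subject", "authority:revenge-drama"),
    Triple("instance:hamlet-1", "extent", "320 pages"),
    Triple("authority:shakespeare", "preferredLabel", "Shakespeare, William, 1564-1616"),
    Triple("authority:revenge-drama", "preferredLabel", "Revenge--Drama"),
]


def access_points(resource):
    """Objects of the resource's statements that have statements describing them."""
    described = {t.subject for t in triples}
    return [t.obj for t in triples if t.subject == resource and t.obj in described]


print(access_points("instance:hamlet-1"))
```

In this reading the plain literal ("320 pages") is not an access point, because nothing describes it further, while the two authority identifiers are, because each carries its own statements.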


#17

Hi Kristin,

Thank you for sharing this with @DoreenH, I look forward to hearing the thoughts and feedback of the Metadata Management SIG.

It may also be worthwhile sharing it with the Resource Access SIG at some point, do you think that is a good idea?


#18

Hi @Kristen_Wilson @kmarti @jim.nicholls and @Ann-Marie

Thank you for your recent feedback. I have made some minor updates to the document, including:

  • Introduced work into the domain model
  • Started to improve the distinction between bibliographic and management / administrative metadata
  • Updated the glossary with an attempt at definitions for all of the terms

Further feedback is more than welcome.

Hugs,

Marc


#19

Hi @kmarti and @marcjohnson,

Apologies for coming in late. I haven’t been keeping an eye on the discuss.folio.org site but will do so now.

As Kristin said, the Metadata Management SIG is just gearing up. We hope to have our first online meeting next week and, in the meantime, I’ll direct the members to this conversation so they can begin reading about the goings on in FOLIO and hopefully contribute to this conversation. And I’ll take a closer look at your document, too, and chime in as I’m able.

Take care,
Doreen


#20

Hello Marc,
I wanted to jump in here after reading Kristen’s comments and your response to the document. I think that what we may need to keep in mind is that there are overlaps in the use of metadata for both description and management. That is, there are multiple roles and uses for the same data set. I have been thinking about this quite a bit recently. A piece of metadata might be used for discovery, management or access, depending on where it is being acted upon. This is true regardless of the format of the resource being sought (electronic or physical).

So, while the sources of our metadata may be either from a bibliographic source or a management source, the same data may be leveraged for patron use or internal management use. In my opinion, it is important to consider both descriptive metadata as well as management metadata when developing the model.

Best,
Jacquie


#21

Hi Doreen,

I’m really glad to hear the Metadata Management SIG is gearing up. If I can be of any help during that process please let me know.

Otherwise, please feel free to direct people to the document. I’ll endeavour to answer any questions I can. That said, there is no deadline for feedback so please don’t feel like you or other members of the group need to rush or divert attention away from other activities.

Hugs,

Marc