Discuss.FOLIO.org is no longer used. This is a static snapshot of the website as of February 14, 2023.

Feedback Request: Knowledge Base and Metadata Planning

marcjohnson

5 Dec '16

Hi everyone.

Over the last few weeks I’ve been trying to describe the scope and direction of the current Knowledge Base and Bibliographic Metadata work that is ongoing at the moment, and have put my thoughts into the following draft document: https://wiki.folio.org/display/~marcjohnson/Knowledge+Base+and+Metadata+-+Scope+and+Domain

I believe it has reached a stage, whilst still very much a work in progress, that feedback from a wider audience would be most appreciated, therefore I would like to ask the members of this community for feedback.

Hugs and thanks,

Marc

Mike_Sollars

7 Dec '16

My FOLIO login credentials do not appear to be giving me access to this document. Perhaps user error on my part?

Charlotte_Whitt

7 Dec '16

Hi Mike,
I can access and read Marc’s document, but you need to be signed in to Discuss.
Best
Charlotte

Kristen_Wilson

7 Dec '16

Hi Marc,

Thanks for sharing this – I’ve been curious to see how the data modeling is coming along. On the whole, I think this is a really good start. The domain model you shared has a lot in common with GOKb and other projects that have been trying to evolve the representation of this space. A few questions/comments:

Use of the term “bibliographic metadata” seems a little off to me in some places in the document. Based on the domain model diagram and other discussion, it sounds like there is a large role for what I think of as “management metadata.” This includes entitlements, subscription models, costs, usage statistics, loans, etc. It’s metadata that describes a library’s interaction with a resource, rather than the content of the resource itself. I’m a little unsure whether this document intends to focus on both bibliographic and management metadata or if the intention is to take just the bibliographic metadata as a starting point.

How exactly does the Instance work? Would a bib record (in any format) be linked to an instance or would it be a type of instance? Is there a concept of a “title” or “work” – e.g., the conceptual work Hamlet of which various instances exist?

What is the Electronic Access record in the domain model meant to represent? Can you explain a little bit more about how this record will relate to a package, subscription, or entitlement?

Best,
Kristen

Ann-Marie

7 Dec '16

It would be helpful to have a definition of Entitlements in the appendix. Is that user-related permissions (e.g. faculty, grad students, students), or resource-related permissions (e.g. single concurrent user, unlimited usage, downloadable, how much can be printed), both, or something else entirely?

marcjohnson

8 Dec '16

Hi Kristen,

Thank you for the thoughts; I very much appreciate the feedback. I’ve tried to answer your questions inline below (with your original comments in italic). I hope some of these thoughts are useful. I have likely only scratched the surface of many of these topics, so please feel free to follow up with more questions or thoughts, especially if something is confusing.

Hugs,

Marc

A few questions/comments:

Use of the term “bibliographic metadata” seems a little off to me in some places in the document.

I’ve gone back and forth around this, the intention was to emphasise that the term knowledge base, when used in this document, is not a generic place for all forms of metadata. I struggled to think of good terminology to describe the subset of metadata we are covering (more on this below).

I would very much like to distinguish between the different contexts of metadata and which concepts relate to each of them (alternative versions of the domain diagram had contexts overlaid upon it).

Based on the domain model diagram and other discussion, it sounds like there is a large role for what I think of as “management metadata.” This includes entitlements, subscription models, costs, usage statistics, loans, etc. It’s metadata that describes a library’s interaction with a resource, rather than the content of the resource itself. I’m a little unsure whether this document intends to focus on both bibliographic and management metadata or if the intention is to take just the bibliographic metadata as a starting point.

The scope of what we are referring to as the Knowledge Base and Metadata work is one of the conversations I hope this document starts.

At the moment, I think it attempts to partially cover aspects of both bibliographic and management metadata, whilst ignoring many aspects of both (e.g. subscription only models a subscription to an electronic package, not to a print journal at the moment).

It might be that this mixed form is a little clunky and confused, it might be that separate documents emerge for each category of metadata. In this document, it was intended to provide context to the domain which the Knowledge Base and Metadata development over the next quarter will likely participate in.

Some of the concepts are intended to represent relationships with other module development, e.g. loan will likely belong in a circulation module and refer to an item.

Two of the challenges facing the ongoing development of the Knowledge Base and Metadata suite of modules are that it is happening concurrently with the UX/Analysis processes for many of the contexts which will use it (e.g. the RM SIG Boston face to face meeting) and that it will likely form a linchpin for connecting those contexts.

I think there is a need to group the concepts we are modelling in FOLIO by named contexts. The list below could be a starter for 10 (and covers a broader domain than the document). It is very likely that the granularity of these contexts is incorrect at this point.

Bibliographic - Instance, Subject, Identifier, Material, Person
Physical Inventory - Physical Item
Electronic Access - Electronic Access / Electronic Entitlement, Usage, Access
Circulation - Loan, Fine, Reservation
Subscription - Package, Subscription, Platform
Acquisitions - Order, Invoice
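
As a rough illustration only, the grouping above could be written down as a simple mapping from context to concepts. The context and concept names are taken from the list; the `CONTEXTS` dictionary and the `contexts_for` helper are hypothetical names used for this sketch and are not part of any FOLIO module.

```python
# Illustrative sketch only: the contexts and concepts come from the list above;
# the data structure and helper name are assumptions, not FOLIO code.
CONTEXTS = {
    "Bibliographic": ["Instance", "Subject", "Identifier", "Material", "Person"],
    "Physical Inventory": ["Physical Item"],
    "Electronic Access": ["Electronic Access / Electronic Entitlement", "Usage", "Access"],
    "Circulation": ["Loan", "Fine", "Reservation"],
    "Subscription": ["Package", "Subscription", "Platform"],
    "Acquisitions": ["Order", "Invoice"],
}


def contexts_for(concept):
    """Return the names of every context that lists the given concept."""
    return [name for name, concepts in CONTEXTS.items() if concept in concepts]


print(contexts_for("Loan"))  # ['Circulation']
```

Keeping the grouping as plain data makes it cheap to regroup concepts later, which seems useful given that the granularity of these contexts is expected to change.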

How exactly does the Instance work? Would a bib record (in any format) be linked to an instance or would it be a type of instance? Is there a concept of a “title” or “work” – e.g., the conceptual work Hamlet of which various instances exist?

An instance can be thought of as very similar to title instance in GOKb and is closely related to the concept in Bibframe. At the moment, that is mostly a consolidation of bibliographic metadata across multiple items/entitlements.

My understanding is that many existing record formats tend to have both bibliographic and management / holdings metadata in them, so the relationship to existing records is likely to be to an instance and to other concepts (e.g. physical item).

Ian and I have talked about whether to model a work, and at present I don’t feel I understand it sufficiently to describe it well and it doesn’t seem necessary for the current scope of upcoming work. I envisage we may introduce it at a later point, particularly when we get deeper into the bibliographic metadata aspects of FOLIO.

What is the Electronic Access record in the domain model meant to represent? Can you explain a little bit more about how this record will relate to a package, subscription, or entitlement?

The three concepts of entitlement, electronic access and physical item are intended to be the beginnings of what a unification between electronic and physical resource management could look like.

The name was an intentional, if possibly naive, attempt to avoid using the same word, entitlement, for electronic resources as for physical resources. It could well be that this would be better referred to as an electronic entitlement and in that regard would be very similar to the entitlement concept in KB+, representing the access to an electronic resource that a subscription entitles an organisation to.

marcjohnson

8 Dec '16

Hi Ann-Marie,

Thank you for responding and asking the question, I’ll endeavour to answer below.

It would be helpful to have a definition of Entitlements in the appendix. Is that user-related permissions (e.g. faculty, grad students, students), or resource-related permissions (e.g. single concurrent user, unlimited usage, downloadable, how much can be printed), both, or something else entirely?

Filling in the glossary is an ongoing endeavour, I’ve been trying to find existing sources for definitions where I can.

In the context of the document, an entitlement is very much resource focused, effectively representing the access to a resource that an organisation is entitled to.

In the case of electronic resources, it is my understanding that this tends to be via subscriptions to packages and for physical resources, the item itself is an embodiment of that entitlement. This is a little clunky, which maybe demonstrates our progress in experimenting with the idea of a unified model for electronic and physical resources.

I hope this helps a little, this document will be updated as these ideas are refined. Please do feel free to ask follow up questions, especially if this is confusing.

Hugs,

Marc

kmarti

9 Dec '16

Hello Marc,

I’ve been reviewing your metadata model and the comments and wonder if what we need is to create a linked model that will cover both the bibliographic metadata and the management metadata in a single group. Perhaps a combination of RM specialists and Metadata specialists could help inform this work. I’m concerned that if we work on this model independently of, or before considering, the descriptive metadata domain, we might not end up with an optimized model, or different functions of FOLIO could be siloed without that being our intention.

Kristen mentioned the idea of work, and I feel that something should be here as well. Are you considering BIBFRAME? I wonder if BF 2.0’s model (https://www.loc.gov/bibframe/docs/bibframe2-model.html) could be incorporated into the model. It would provide that “work”-level object, and also bring in ways to associate description and access to the particular object in question. You already have an instance and item in the model, so I feel like this could be quite positive, and if we could pull it off, revolutionary. We do have some experts on linked data and BF as part of the project, and this could be a wonderful way to build a model from the ground up on linked data principles that would also support the management needs of a library collection.

You mention AACR in a metadata standard of “beyond AACR.” But at this point most libraries have implemented RDA. Do you really mean “beyond RDA”?

For outcomes, will the standard support original descriptive work and setting of access points, or is that something that you see happening outside of FOLIO with that data being brought in through a similar mechanism to copy cataloging? This could be seen as similar to catalogers doing original work in OCLC and then importing the finished record. This could be limiting, however, for non-OCLC (or other utility) members.

How do you want to handle representations for non-book/journals? Do you see managing more complex objects besides individual monographs, such as archival collections?

It might be possible to consider the commonalities of e-books and e-journals, and then add in the elements that are more complex or unique to each (e.g., coverage data for e-journals). E-journals will represent the more complicated use case (but you already know that!). For management purposes, libraries will also want to track their perpetual access entitlements versus their access entitlements, which many times are different.

Looking forward to discussing this further!

peter

9 Dec '16

Hi @Mike_Sollars – there was a permissions problem in the wiki space where that document was located. It required users to be signed in with a wiki account. (You can sign up for an account on Wiki and Issues at the Issues account creation page.) That problem has been fixed; let me know if you can’t see the document at this point.

jim.nicholls

10 Dec '16

What are the thoughts around parts? For example, the parts of a kit or a disc accompanying a book. Particularly when parts go missing, and a) as a librarian I want to not loan the missing part and to not expect the missing part to be returned and b) as a patron I want to discover kits that have all the parts I need.

marcjohnson

10 Dec '16

Hi Jim,

Thank you for your question. I’ve done my best to begin to answer it, please do feel free to follow up with further thoughts or questions.

What are the thoughts around parts? For example, the parts of a kit or a disc accompanying a book. Particularly when parts go missing, and a) as a librarian I want to not loan the missing part and to not expect the missing part to be returned and b) as a patron I want to discover kits that have all the parts I need.

I haven’t really considered parts of resources yet, so I’m not sure I would know where to start.

I imagine that the expected parts might be modelled as part of an instance and the condition / status of each might be a property of an item.

Are the examples you provide important use cases for your library?

marcjohnson

10 Dec '16

Hi Kristin,

Thank you for the thoughts, I very much appreciate the feedback. I’ve tried to answer your questions below. I hope some of these thoughts are useful, please feel free to follow up with more questions or thoughts, especially if something is confusing.

I’ve been reviewing your metadata model and the comments and wonder if what we need is to create a linked model that will cover both the bibliographic metadata and the management metadata in a single group. Perhaps a combination of RM specialists and Metadata specialists could help inform this work. I’m concerned that if we work on this model independently of, or before considering, the descriptive metadata domain, we might not end up with an optimized model, or different functions of FOLIO could be siloed without that being our intention.

I agree, both resource management and bibliographic metadata specialists need to inform this work, I would very much like to see participants from both communities contribute to a shared or linked model that is supportive of the needs of both groups.

As I think came up when I was replying to Kristen (@Kristen_Wilson) one of the challenges with the development of the knowledge base and metadata capabilities of FOLIO is that it is happening concurrently with the UX/Analysis process for many of the domains that will need to use it. I’m open to any ways we can improve feedback into these models, whilst this is ongoing.

Kristen mentioned the idea of work, and I feel that something should be here as well. Are you considering BIBFRAME? I wonder if BF 2.0’s model (https://www.loc.gov/bibframe/docs/bibframe2-model.html) could be incorporated into the model. It would provide that “work”-level object, and also bring in ways to associate description and access to the particular object in question. You already have an instance and item in the model, so I feel like this could be quite positive, and if we could pull it off, revolutionary. We do have some experts on linked data and BF as part of the project, and this could be a wonderful way to build a model from the ground up on linked data principles that would also support the management needs of a library collection.

The model in my head (rather than the limited one in the document) has been heavily influenced by what I have managed to read about BIBFRAME, though I still have a lot to learn about it and it would seem that it doesn’t come across strongly in the diagram in the document.

I imagine the concept of a work will appear in this model over time, I am hesitant to introduce it early, as it is easier to introduce something new than replace an existing concept in the system.

What needs could we help fulfil by modelling a work at this point?

Other than work, what concepts from BIBFRAME do you think would be useful to include sooner rather than later?

I would very much like the input and involvement from experts in Linked Data and BIBFRAME. I believe this comes back to your previous point, that this model spans multiple interest groups and contexts.

You mention AACR in a metadata standard of “beyond AACR.” But at this point most libraries have implemented RDA. Do you really mean “beyond RDA”?

This was suggested by an early reviewer, in an attempt to emphasise the need not to couple ourselves to a limited set of metadata representation, there may well be a better way of saying that without referring to a specific standard.

I haven’t seen much about RDA implementations in a library context, are there any examples you can point me towards?

For outcomes, will the standard support original descriptive work and setting of access points, or is that something that you see happening outside of FOLIO with that data being brought in through a similar mechanism to copy cataloging? This could be seen as similar to catalogers doing original work in OCLC and then importing the finished record. This could be limiting, however, for non-OCLC (or other utility) members.

If original descriptive work refers to the creation of new bibliographic metadata, then I believe this model will support that. It is, along with localised customisation, the primary driver behind the idea of distinguishing between an internal and external instance. This feeds into an aspect of different cataloging approaches which we have been referring to as copy and reference cataloging.

What are access points in this context?

How do you want to handle representations for non-book/journals? Do you see managing more complex objects besides individual monographs, such as archival collections?

At the moment the model is deliberately minimal and only really represents the beginnings of support for physical and electronic monographs, I think I need to improve the clarity of this in the document. This minimalism is motivated by the desire to elicit feedback from the community to drive the creation / evolution of the model and to only model what we need for our current requirements whilst being careful not to back ourselves into a corner.

I envisage the model evolving to support serials fairly soon and other resource types over time (I don’t feel confident I could model them at the moment), this might include archival collections. Much of this evolution will be driven by the broader analysis/UX process, the priorities of the Bibliographic Metadata and Resource Management SIGs and the development of other modules.

It might be possible to consider the commonalities of e-books and e-journals, and then add in the elements that are more complex or unique to each (e.g., coverage data for e-journals). E-journals will represent the more complicated use case (but you already know that!). For management purposes, libraries will also want to track their perpetual access entitlements versus their access entitlements, which many times are different.

The current model is a first attempt to express potential commonalities between e-books and physical books, in part because the roadmap driving this means we intend to soon connect to an external knowledge base (which will most likely have information about electronic resources) and the desire to also focus on being able to migrate (part of) an existing inventory (which involves physical resources). I suspect both physical and electronic journals will follow after that and that will drive the modelling of those common aspects.

I have had a variety of conversations about how we might model aspects of a resource which vary by type and material (and probably other characteristics), I’m not yet confident to be able to express a generalised model that supports that.

Licensing (which, to me, perpetual access falls under) is a context that we haven’t really started to model yet (though I have had some thoughts following the conversations during the RM group meetings). I don’t know if that will fall under the core knowledge base / metadata development efforts or not, though at the very least it will need to link to resources within it.

I believe there is a balance to be struck here (and I am still learning what that is) between developing a system that can be used sooner rather than later in a limited fashion and creating a model which allows for continual evolution with as little disruption to other FOLIO modules as possible in the future.

kmarti

11 Dec '16

Hi Marc, I want to make sure members of the nascent Metadata Management SIG see this. I, too, don’t know the balance between making a system that can be used right away with a simpler model versus doing more complex modelling work upfront. So it may be possible that we can segment the work and leave the “Work” à la BIBFRAME for later, but I don’t know. I’m trying to consider that as we model Instance, it be harmonized with what Instance means in the metadata community’s work on BIBFRAME, and thus I start thinking about Work. But I am not a BIBFRAME expert. I’ll try to answer your specific questions here.

We are using RDA here at Chicago and I would assume most libraries at this point that are part of the project are as well. But we are still using MARC, so the changes between RDA and AACR2 seem subtle in MARC.

I’m thinking of the controlled headings in a catalog record that link a particular work to its author, subject headings, or other authority record. A means to “access” the record.

marcjohnson

14 Dec '16

Hi Kristin,

Thank you for the continued feedback.

I want to make sure members of the nascent Metadata Management SIG see this.

Yes, I would too, I’m not really aware of where the creation of that SIG is. I would be grateful for any assistance in helping me connect with that group and/or ensuring they have an opportunity to see this document.

I, too, don’t know the balance between making a system that can be used right away with a simpler model versus doing more complex modelling work upfront. So it may be possible that we can segment the work and leave the “Work” à la BIBFRAME for later, but I don’t know. I’m trying to consider that as we model Instance, it be harmonized with what Instance means in the metadata community’s work on BIBFRAME, and thus I start thinking about Work.

Given you and @Kristen_Wilson have both mentioned work as important for framing the concept of an instance (which is intended to be very similar to that of BIBFRAME) then I will happily add it into the model (will update the diagram and terminology accordingly).

Also, if we can confidently model the complex aspects of this domain at this point, I am more than happy to see those models and/or participate in any conversations around them.

I’m keen for us to identify the aspects where we believe there is specific complexity, particularly if we are trying something new or different. There is a tradeoff in planning development effort between getting to a sufficiently working, yet limited, system and in trialling these new or different ideas.

We are using RDA here at Chicago and I would assume most libraries at this point that are part of the project are as well. But we are still using MARC, so the changes between RDA and AACR2 seem subtle in MARC.

I’ve updated the document to include RDA alongside AACR2. I’d appreciate your thoughts on whether this statement still makes sense. Are there other popular bibliographic cataloguing standards? Does FOLIO need an opinion on them?

I’m thinking of the controlled headings in a catalog record that link a particular work to its author, subject headings, or other authority record. A means to “access” the record.

Thank you, I was finding this term confusing, as when I think of access it is in the context of being able to access electronic resources hosted outside of the organisation.

My reflection on that is that access in a bibliographic metadata context is the ability to refer to an accessible authoritative (usually external) representation of a person or subject etc. Is that consistent with your description?

kmarti

15 Dec '16

Hi Marc,

I shared this post with @DoreenH, who is the convener of the Metadata Management SIG. That is just getting underway now. Hopefully we’ll get some good feedback!

I think mention of RDA/AACR2 is probably sufficient for now.

Yes, I think from a linked data perspective, one way to think of the access points is that they would be the object of the subject–predicate–object triple and would have their statements describing them. In a more traditional MARC format, they are the controlled headings, where catalogers have provided specific additional information beyond what is provided in the piece itself to make the record easier to find, using authorities.
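
To make the triple framing a little more concrete, here is a minimal sketch in plain Python. All of the identifiers and predicate names (`ex:instance/1`, `ex:author`, and so on) are invented for illustration; they are not BIBFRAME or FOLIO terms, and the `access_points` helper is hypothetical.

```python
# Illustrative sketch only: every identifier and predicate name below is
# made up for this example; none of them are BIBFRAME or FOLIO terms.
triples = [
    # (subject, predicate, object)
    ("ex:instance/1", "ex:title", "Hamlet"),
    ("ex:instance/1", "ex:author", "ex:person/shakespeare"),
    ("ex:instance/1", "ex:subject", "ex:subject/tragedy"),
    # The access points are themselves described by further statements.
    ("ex:person/shakespeare", "ex:preferredName", "Shakespeare, William"),
    ("ex:subject/tragedy", "ex:preferredLabel", "Tragedy"),
]


def access_points(resource):
    """Objects of a resource's statements that have statements of their own."""
    described = {s for s, _, _ in triples}
    return [o for s, _, o in triples if s == resource and o in described]


print(access_points("ex:instance/1"))
# ['ex:person/shakespeare', 'ex:subject/tragedy']
```

The title is a plain literal, so it is not returned; only the objects that carry their own descriptive statements are treated as access points here.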

marcjohnson

15 Dec '16

Hi Kristin,

Thank you for sharing this with @DoreenH, I look forward to hearing the thoughts and feedback of the Bibliographic Metadata SIG.

It may also be worthwhile sharing it with the Resource Access SIG at some point, do you think that is a good idea?

marcjohnson

15 Dec '16

Hi @Kristen_Wilson @kmarti @jim.nicholls and @Ann-Marie

Thank you for your recent feedback. I have made some minor updates to the document, including:

Introduced work into the domain model
Started to improve the distinction between bibliographic and management / administrative metadata
Updated the glossary with an attempt at definitions for all of the terms

Further feedback is more than welcome.

Hugs,

Marc

DoreenH

15 Dec '16

Hi @kmarti and @marcjohnson,

Apologies for coming in late. I haven’t been keeping an eye on the discuss.folio.org site but will do so now.

As Kristin said, the Metadata Management SIG is just gearing up. We hope to have our first online meeting next week and, in the meantime, I’ll direct the members to this conversation so they can begin reading about the goings on in FOLIO and hopefully contribute to this conversation. And I’ll take a closer look at your document, too, and chime in as I’m able.

Take care,
Doreen

JacquieSamples

16 Dec '16

Hello Marc,
I wanted to jump in here after reading Kristen’s comments and your response to the document. I think that what we may need to keep in mind is that there are overlaps in the use of metadata for both description and management. That is, there are multiple roles and uses for the same data set. I have been thinking about this quite a bit recently. A piece of metadata might be used for discovery, management or access, depending on where it is being acted upon. This is true regardless of the format of the resource being sought (electronic or physical).

So, while the sources of our metadata may be either from a bibliographic source or a management source, the same data may be leveraged for patron use or internal management use. In my opinion, it is important to consider both descriptive metadata as well as management metadata when developing the model.

Best,
Jacquie

marcjohnson

19 Dec '16

Hi Doreen,

I’m really glad to hear that the Metadata Management SIG is gearing up. If I can be of any help during that process please let me know.

Otherwise, please feel free to direct people at the document. I’ll endeavour to answer any questions I can. That said, there is no deadline for feedback so please don’t feel like you or other members of the group need to rush or divert attention away from other activities.

Hugs,

Marc

marcjohnson

19 Dec '16

Hi Jacquie,

Thank you for your thoughts.

I agree, there are a number of different contexts which may want to use different categories and aspects of metadata.

We have been thinking about this too, and are conscious to come up with a model which decouples sources and uses as much as possible, so that the providers of the data don’t have to know all of the uses and vice versa.

We have also been thinking about the value of distinguishing between different categories of metadata in the model.

Hugs,

Marc

jim.nicholls

20 Dec '16

This is a problem for us with our current system. Our current system has 2 levels: bib and item. We use one of two techniques to represent items that have multiple parts.

The first technique, which we no longer use, is to give each part a separate item record. This has several disadvantages including: a messy catalogue; having to make only one part requestable and explaining to clients which part to request; having to make only one part attract fines/demerits to avoid clients getting multiple fines for a single loan.

The second technique, which we now use, is to abuse the volume field on an item. The volume field is meant to have the volume number of a multiple-volume work (eg: v. 1; v. 2; v. 3). We are now using it to indicate what parts are in an item (eg: book + disk; 1 textbook, 1 workbook, 1 teacher’s guide).

If you’re looking for an example of something real to validate the model against, I could suggest trying to represent Integrated Chinese. This title has now gone through 4 editions. Each edition has multiple levels and sub-levels. Each level/sub-level consists of a textbook, workbook, character workbook, audio/dvd, and teacher’s resources book.

We have multiple copies of each edition, sublevel and part.

Additionally we have multiple variations on bundling those parts.

We have the parts by themselves and individually loanable.
We have the textbook with the DVD bound inside the back cover.
For Chinese language students, we have kits of the textbook, workbook, character workbook and audio/dvd.
For students studying to become Chinese language teachers, we have kits with all the parts, including the teacher’s resources book.

DoreenH

8 Feb '17

Hi Marc,

I wonder if you would be able to join the Metadata Management SIG for one of our meetings to talk more about this. We’re finally settling in to a regularly scheduled meeting. When we were beginning to touch base with each other, I did alert them to your interest in getting feedback but I imagine that we’d benefit from hearing from you directly.

We meet on Thursdays at 11:30am EST. Is there any time in the upcoming weeks that you could join us?

Thanks
Doreen

marcjohnson

9 Feb '17

Hi Doreen,

I’m glad that the Metadata Management SIG has settled into regular meetings. I am very happy to attend one of the upcoming meetings, share some of the discussions we’ve had up to this point and receive some initial feedback.

I’m pretty sure I can attend any of the next few weeks, including today (which might be too short notice).

Hugs,

Marc

marcjohnson

9 Feb '17

Hi Jim,

I’m sorry it has taken me so long to follow up with you. Thank you for sharing this information.

We are likely some way from actually modelling any of this, though it is useful context for considering how various models could work. I appreciate you taking the time to write down the examples. Both of the methods your organisation has used have distinct drawbacks, which I suspect is down to the notion of parts not being explicit in the underlying domain model of the system.

I think if we were to model this, we’d want parts to be an explicit structure in the metadata. A thought I’ve been having recently, and it is just an inkling of an idea in my head, is to model a hierarchy, both of resources and the derived information representing the actual items/copies (for want of a better general term) indicating which are loanable (which the circulation system uses to determine availability).
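
Purely as a sketch of that inkling, and not a proposal for the actual model, a hierarchy of items with explicit parts might look something like the following. The `Item` class, its fields, and the `loanable_units` helper are all hypothetical; the example data loosely follows the student kits described earlier in the thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A physical item that may contain other items as parts (hypothetical)."""
    name: str
    loanable: bool = False   # can this node be loaned on its own?
    missing: bool = False    # has this part gone missing?
    parts: List["Item"] = field(default_factory=list)

    def is_complete(self) -> bool:
        """True when this item and all of its parts, recursively, are present."""
        return not self.missing and all(p.is_complete() for p in self.parts)


def loanable_units(item: Item) -> List[Item]:
    """Walk the hierarchy and collect every node marked as loanable."""
    found = [item] if item.loanable else []
    for part in item.parts:
        found.extend(loanable_units(part))
    return found


kit = Item(
    "Student kit",
    loanable=True,
    parts=[
        Item("Textbook"),
        Item("Workbook"),
        Item("Character workbook"),
        Item("Audio/DVD", missing=True),
    ],
)

print([unit.name for unit in loanable_units(kit)])  # ['Student kit']
print(kit.is_complete())  # False
```

Because the loanable flag lives on the node rather than on each part, only the kit is offered for loan, while the completeness check would let a patron filter for kits that still have every part.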

Hugs,

Marc

IDS_Project

20 Apr '17

Have you considered including Resource Sharing in this workflow? Whether an item is purchased or borrowed doesn’t matter to the patron, as long as they get the item they want. It could be that the library already owns one or two of the items and, based upon its conspectus, doesn’t want to purchase another copy. A lot of this workflow is already completed, along with a generic conspectus, at http://www.gistlibrary.org. It was originally for ILLiad but could be modified. Also, is Resource Sharing being developed at this point?

Thanks!
Mark

michael.winkler

20 Apr '17

Mark,
Thanks for your outreach. We are definitely interested in resource sharing, since increasingly, libraries depend on the larger community collection to satisfy reader demands for resources. I’m not familiar with GIST, but a quick look is interesting. I’d encourage you to come to our consortia services SIG meeting. Your input could be valuable there, and in the access services area.

Best,
Michael

peter

25 Apr '17

…and to provide a pointer, the consortia SIG meets on Fridays at 10:00 Eastern U.S. time. Connection details are on the SIG’s meeting and notes page. More details about the Consortia SIG are on its ‘about’ page. Would be happy to have you at the next meeting, Mark.