
Sabine Ocker helps enterprises organize their content.
She uses taxonomies and other content metadata to make sure that customers get the information they need, when they need it.
Sabine’s superpower is her ability to talk about her work in a way that resonates with business decision makers. This ensures that she always has the budget and other support that she needs to do this important work.
We talked about:
- her work as an enterprise information architect focused on taxonomy and metadata
- how her work benefits end users of enterprise content systems
- the importance of focusing on the customer and their needs in your metadata strategy
- some applications and uses of taxonomy metadata
- how taxonomies and ontologies fit into the continuum of metadata
- the taxonomy practice maturity model they use at Comtech
- the importance of tying taxonomy work to a specific business driver
- why it’s important to have an elevator pitch that shows how your taxonomy work supports enterprise KPIs
- the shift of product documentation from a product-support cost item to a marketing tool
- her process for convincing key business stakeholders to support her work
- the importance of matching user intent with the content that satisfies the intent
- the difference between a taxonomy ecosystem and a true ontology management system
- the importance of having a foundation of maturity, governance, collaboration, and other business practices in place to enable taxonomy and other metadata work
Sabine’s bio
Sabine Ocker has 20 years of experience passionately driving content and metadata strategy and execution in structured markup publishing environments. She has worn many hats: taxonomist, enterprise information architect, data analyst, content strategist, trainer, and DTD developer. As a consultant for Comtech Services, Sabine guides clients in defining functional DITA information models and ontologies, drawing on her real-life experiences with customers, vendors, and clients. Away from her consulting work, Sabine lectures and writes about the history of 19th and 20th century photography.
Podcast intro transcript
This is the Content Strategy Insights podcast, episode number 90. Sabine Ocker is an enterprise information architect. She works with taxonomies and other content metadata to help big companies deliver to their customers the right information at the right time. Sabine is not just a great content architect. She’s also an expert at convincing decision makers to support her work. She speaks the business language that managers and executives use, crafting concise stories that persuade decision makers to open their checkbooks.
Interview transcript
Larry:
Hi, everyone. Welcome to episode number 90 of the Content Strategy Insights podcast. I’m really happy today to have with us Sabine Ocker. Sabine is an enterprise information architect. She’s currently working at Comtech, a consulting firm. Welcome, Sabine. Tell the folks a little bit more about what you do there at Comtech and what an enterprise information architect does.
Sabine:
Well, hello, Larry and thank you very much for the invitation to appear here on this prestigious podcast. And hello podcast listeners out there. As Larry mentioned, my name is Sabine Ocker and I’m a longtime enterprise information architect. And the way that I think about that is that usually that means that I might be centered in an engineering organization or in an information development organization. But the work that I do especially around taxonomies and metadatas, really does extend into the tool realm. Having the business drivers of metadata be manifested in the authoring environment and then also be surfaced and exposed and being processable by the delivery platform and the tools and being sort of that facilitator that writes the requirements and checks the functional specifications and determines the values that just sort of make that soup to nuts connection.
Sabine:
And I have done that for companies like EBSCO Publishing. I was a longtime metadata maven. That was my name and my title at Sun Microsystems. I also worked for Akamai Technologies, where I helped them migrate to a DITA authoring environment as well as instantiate taxonomies, and also the MathWorks. And I have done taxonomy and metadata enablement work as part of my gig for all of those. And now at Comtech, I teach taxonomy development and I also work with Comtech clients to help them create a taxonomy strategy to help enable their business needs.
Larry:
Great. I got to say, my brain exploded a little bit when you described the scope of working across an enterprise like that. But I think the way you prevent brain explosions like that is by just focusing on a specific instance of what you do, because there’s a lot there. You have this sort of enterprise-spanning influence, but usually it’s from a customer’s perspective. It’s like, “Hey, I need this particular piece of information. I’m interacting with this, whatever interface, a chatbot or a voice assistant or a website or something like that.” Tell me how your work helps that end user. Tell me sort of a use case out there where somebody needs some content. What do you got for them?
Sabine:
Great. Well, and so it is a special skill to be able to take that 30,000-foot business driver and manifest it as use cases and user stories which then are actionable, both on the content side and then also on the development side. Metadata is an enabling technology. It doesn’t do a gosh darn thing. It has to be applied on the content side and then processed, exposed on the delivery platform side. And so, a very common use case for why you would want to have a taxonomy present on the delivery platform could be dynamic content processing, especially behind a login. You log in, we know who you are, we know what products you own, we know the last time you logged in. We just say, “Hello, Ms. User, admin person. Here’s all the stuff that’s been updated since you last joined us. Here’s some pieces of information that could be information development or other places that we think might be of interest to you,” et cetera.
Sabine:
That’s a very common thing or it could be just a simple search filter or a facet. Filters allow you to scope your searches. You’re just like, “I don’t care about this. I don’t care about this. I only care about these things. Give me my search results.” Or a filter, sort of the Amazon model where you enter a search term and then you just say, “No, I don’t want that. I don’t want that. I wanted that.” You scope either pre-search or post-search. Those are some common metadata enablements. And so, at Akamai when I joined the company, what my boss told me, “Listen, our business driver is we’re going to get the right content to the right person at the right time for our product experience information.” And I was like, “Yee-haw.” I was very excited because all of that is metadata enabled.
Sabine:
You have to know who the people are, who are your users? What are their roles? What are their goals? And so the right content, whether what products they’re looking at, what product versions they’re looking at, what service packs or whatever the slicing and dicing of that product level stuff, who they are, where they are in the customer journey, what are they trying to do? And then serve them up the information that they need in order to be able to just get an answer to their question, so that they could move on and do something else. We’re not going to say, “Here is this giant tome of a book. Read through it and then you’ll know what you need to do.” It’s like, if you have a question, we’re going to send you to just that piece of information. Those are some of the things that being an enterprise metadata person and taxonomist can sort of facilitate.
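As a rough illustration of the filtering Sabine describes, here is a minimal Python sketch in which each piece of content carries metadata for product, version, role, and journey stage, and a filter keeps only the pieces that match every selected facet. The `Topic` class, the `filter_topics` function, and all sample values are hypothetical; they are not drawn from Akamai or any real delivery platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """One granular piece of content plus the metadata applied to it."""
    title: str
    product: str
    version: str
    role: str
    journey_stage: str


def filter_topics(topics, **facets):
    """Return the topics whose metadata matches every selected facet value."""
    return [
        t for t in topics
        if all(getattr(t, name) == value for name, value in facets.items())
    ]


# Hypothetical sample content, for illustration only.
TOPICS = [
    Topic("Install the edge server", "edge-server", "2.0", "admin", "install"),
    Topic("Rotate API keys", "edge-server", "2.0", "developer", "optimize"),
    Topic("Upgrade from 1.x", "edge-server", "2.0", "admin", "upgrade"),
    Topic("Install the edge server", "edge-server", "1.4", "admin", "install"),
]

# An admin installing version 2.0 gets one answer, not the whole library.
matches = filter_topics(
    TOPICS, product="edge-server", version="2.0",
    role="admin", journey_stage="install",
)
```

The same function covers both scoping styles mentioned above: pass the facets before running a search to scope it, or apply them to a result list afterward to narrow it.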
Larry:
And that taxonomy is one of the most important pieces of that metadata. But we’re in an increasingly kind of connected and complex world where connecting, will you hit a situation where you have different taxonomies that you need to kind of tie to one another?
Sabine:
Yes. I feel like the word of the day, Larry, is continuum. A taxonomy is actually, it’s not binary. It’s not like you have a taxonomy or you have no taxonomy, it’s actually a continuum of things. We know that a taxonomy has as its enablement, metadata. The metadata on the one end of the spectrum could be just a simple controlled list. It’s just, these are the values that you could pick as the metadata that you want to subscribe. And if you think about that, product names, especially if you’re in the drug industries, you want to make sure that there’s no way any writer could introduce a typo. You make that an enumerated list. And so therefore the content creators just select the value. That’s on the one end, the simple things. Then as you move forward, you stick those metadata values into a tree structure.
Sabine:
You give them parents and grandparents and children. You have a hierarchical structure. And then you might also introduce sort of a thesaurus where you say, “This is the preferred way we call it.” Or this is what this thing, this product used to be called so we’re going to make sure that we can capture that information, that relationship between what it’s called now and what it used to be called. Or maybe there’s this notion of competitor’s terms, that’s another area of a thesaurus. And then you just move on and then you just have these more relationships where it’s like, this is a part of this. This is consumed by this. This is in this format. This solves this problem or achieves this goal. Then you start to have all of these little lists or metadata values or hierarchical structures, which are intertwingled.
Sabine:
They have a relationship with one another and then, poof pow and the far end of that taxonomy spectrum, you have what some people call an ontology. I call an ecosystem of taxonomies. Where you have all of those different hierarchical structures or lists or thesaurus or everything where the relationships between them are known. Therefore, any content object that is tagged with those appropriate metadata can be served up with this complex notion of if this, then this. And if you think about that, it’s a lot of backend work, but on the front end, it’s all transparent to the user. They just know that magically, they just enter a search term or they log in even and they get this content that is just like Goldilocks. It’s the right temperature, it’s the right size, it’s everything. It’s a lot of backend work, but for the positive porridge-eating experience on the front end.
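The continuum Sabine walks through can be sketched in a few lines of Python: a controlled list, then a hierarchy, then a thesaurus, then typed relationships that tie the lists together. Every term, product name, and function name below is a hypothetical stand-in chosen for illustration, and a real taxonomy tool would store these structures very differently.

```python
# Hypothetical values throughout; none of these come from a real taxonomy.

# 1. Controlled list: writers pick a value instead of typing one.
PRODUCTS = {"Edge Server", "Edge Cache", "Edge Monitor"}


def validate_product(value):
    """Reject any product name that is not in the controlled list."""
    if value not in PRODUCTS:
        raise ValueError(f"{value!r} is not in the controlled list")
    return value


# 2. Hierarchy: each term points at its parent (None marks a root).
PARENT = {
    "Hardware": None,
    "Servers": "Hardware",
    "Edge Server": "Servers",
}


def ancestors(term):
    """Walk up the tree and return parent, grandparent, and so on."""
    chain = []
    while PARENT.get(term) is not None:
        term = PARENT[term]
        chain.append(term)
    return chain


# 3. Thesaurus: former names and competitor terms map to the preferred term.
PREFERRED = {
    "Edge Box": "Edge Server",        # what the product used to be called
    "Perimeter Node": "Edge Server",  # a competitor's term
}


def preferred_term(term):
    """Return the preferred term, or the term itself if it has no entry."""
    return PREFERRED.get(term, term)


# 4. Typed relationships across the lists: (subject, relation, object).
RELATIONS = {
    ("Edge Cache", "is_part_of", "Edge Server"),
    ("Install guide", "is_in_format", "HTML"),
    ("Cache purge topic", "solves", "Stale content"),
}


def related(subject, relation):
    """Return every object linked to the subject by the given relation."""
    return sorted(o for s, r, o in RELATIONS if s == subject and r == relation)
```

Each step adds structure without discarding the previous one, which is why it reads as a continuum rather than a choice between having a taxonomy and not having one.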
Larry:
You reminded me, it’s kind of like any kind of good design work. It’s the front-end user just doesn’t perceive all that, but there’s a lot. The way you just outlined that, there is a lot of hard, complex, interesting work happening on the backend. Tell me a little bit more about from that continuum. And there’s, I remember I saw a presentation you did at LavaCon and one of the things you didn’t talk about but mentioned was that there’s a notion of taxonomy maturity levels. And does that sort of track to this continuum? Or is that more about practice? I guess does that match up to this?
Sabine:
Yes. At some level it does because you might just say, “I don’t want my writers to enter in text strings.” Because for example, when I was in Sun Microsystems, an initiative that I had to run every single year was to look for “meat data” because people would misspell metadata. And so I sort of feel like the meat data problem is something that is solved by simply having an enumerated list. And so generally you don’t get to a taxonomy ecosystem without being fairly mature in your organization. And at Comtech we have a taxonomy maturity model. It’s part of our overall information process maturity model. There are 10 facets of the maturity model. What is the level of support you have in the management? How robust are your processes for maintaining and communicating about your taxonomy? What are the inputs that you’ve got for when you need to make a change to the taxonomy?
Sabine:
And so I do really feel like before you can make your taxonomy into a weightlifter, you really have to have all the processes in place and certainly you need managerial support and you need to know what is it you’re trying to achieve? What is the business driver? And without a business driver, you can have an effective taxonomy, but since it requires so much work in the backend and people, there’s only so far you can take it on your own. My hobby is making my company’s taxonomy better. You can only go so far. And at some point you have to start involving other people. Generally it should be, if you’re not going to have an enterprise taxonomy, then you need a mapping.
Sabine:
Then part of your other, one of the little continuums on your taxonomy thing is that you have a mapping. When I worked at Sun Microsystems, we deployed a B2B commerce site that allowed, for the first time ever, users to sort of configure a smallish server, put all the racks and all the things that you needed inside of it and then press “buy.” There were 17 systems of record that had to supply data to that eCommerce site. My job was to determine what was the system of record for each and every piece of data and also to figure out what was the update schedule? Was it manual? Was it automated? Did there need to be sort of a gatekeeper? And just get all those systems to get the data populated correctly.
Sabine:
And I think that that’s something that is just going to continue to need to be done in terms of understanding you don’t have to have one giant storage system for all of the metadata, but if you’re going to have an enterprise thing that takes multiple systems, you do need a means to map them from one value to the other. Content types is a good example of that.
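A value mapping of the kind described here could look something like the following Python sketch, which translates each system's local content-type value into one shared value. The system names, the local values, and the `to_enterprise_value` function are all assumptions made up for this example.

```python
# Hypothetical system names and values, for illustration only.
# Key: (system, that system's local content-type value)
# Value: the shared value that all systems agree to map to.
CONTENT_TYPE_MAP = {
    ("cms", "how-to"): "task",
    ("support_kb", "procedure"): "task",
    ("training_lms", "lesson"): "concept",
    ("cms", "ref"): "reference",
}


def to_enterprise_value(system, local_value):
    """Translate one system's content-type value to the shared value."""
    try:
        return CONTENT_TYPE_MAP[(system, local_value)]
    except KeyError:
        # An unmapped value is a governance question, so fail loudly.
        raise KeyError(
            f"No mapping for {local_value!r} in system {system!r}"
        ) from None
```

With a table like this, each system keeps its own vocabulary and only the mapping has to be maintained centrally, which matches the point that one giant metadata store is not required.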
Larry:
Exactly. And you’re kind of getting it, the overarching use case here is complexity, the need to simplify complexity or to deal with. But the other thing you just said there, I want to get to, because this comes up in every single conversation on this podcast is that it’s mostly about people and convincing them that, no, this is a better solution to that problem. And a lot of what you just said that and something we talked about before we went on the air, was that money and how that figures into this, but all these, how do you get the budget to do? Because to do this kind of complex stuff that enables you to do these important things, costs money and you have to convince somebody to write a check. Tell me your process there.
Sabine:
Well, it’s funny because in my taxonomy development course, I always tell the students, “Listen, you need an elevator pitch. You need to be able to articulate why you cannot do what your enterprise KPIs are without having a fully functional taxonomy and all of its robust supporting processes. And you need to be able to say that two floors on an elevator going up.” I do ask, I do make sure I encourage. I encourage all the students to say that they have to be able to match what is the driver to what is the technology and the content requirements and all of those processes that are needed to be able to put those two things in alignment.
Sabine:
I think using sexy terms, like for the longest time faceted search was you could get people’s attention. Now I think it’s chatbots and virtual assistants. I feel like being able to articulate what under the covers is needed in order for an organization to adopt to drive digital transformation, to drive the adoption of a virtual assistant or a chatbot is something that most certainly is going to get the management at the purse strings level’s attention.
Larry:
Yeah. And that’s another. Is that a buzzword that works, digital transformation? Because I think that’s really come to the fore with COVID I think and I think that, but does that work?
Sabine:
I think it does. Well, I think that there’s so much data that’s available now about, well I come from the perspective of product documentation and the product documentation experience, information development. And it used to be that they were just considered a cost center and not mission critical. Now there’s all this data that says, “By the way, customers, potential customers, access product documentation more than they do to talk to a salesperson while they’re making a decision to either buy a product or upgrade a product.” Therefore we’re realizing that the access to content spans across the just I’m installing something, or I’m buying something, I’m discovering something, I’m optimizing something, I’m troubleshooting something, I’m upgrading. Those are all touch points for content.
Sabine:
And so in order, if you break down from talking about content from features and functions, to a journey, then once you start to think about things in a journey, then you realize, whoa, it is a bigger field. I have to think about support. I have to think about training. I have to think about marketing. And I have to think about how am I effectively going to get the content to the right person at the right level of granularity, depending on where they are in their journey and what they’re looking for.
Larry:
No, it seems like that’s a point at which you have to start pushing management down further down that continuum. Actually, let me take you to task here. What’s your elevator pitch for convincing executives to go from that kind of simpler taxonomy-based okay we’ve solved the meat data problem and we’ve got some basic governance and other controls in place, but boy to do this fancy stuff that you want that new chatbot, we’ve got to get something that looks more like an ontology than a taxonomy to drive this thing. How do you pitch them?
Sabine:
Well, what I usually say is I just try to use business terms. I would say, “In order to be able to more effectively brand our content and our experience that our customers use have with our content on a delivery platform, we need to be able to break down those content silos, break down even the publications into small, granular question answering pieces of information and serve them up, ta-da! To the person who is looking for them. And in order to be able to do that, we have to know who they are, what are their things that they want to do in their role? What products do they have? And then to be able to associate each one of those is a piece of metadata to the appropriately granular content objects.”
Sabine:
And then the simple matter is training a tool to be able to recognize those questions. Who am I? What am I trying to do? And what is my role? And what are my products? As intent and then match it on the backend side to the content and then serve it up on the silver platter to the user. I feel like that’s the pitch. Say, “If we want to continue to expand our role, our market share and increase the value of our brand, then we need to be able to say to the customers, ‘Let us answer your question,’ instead of, ‘Let us point you to this publication that contains the information.’”
Larry:
I think what you just said there, that kind of gets it like the need to move to something, if not a full-blown ontology, something a little more sophisticated than just a simple taxonomy, because to connect that user intent with your content, you got to have there’s meaning in both of those. And this gets into the whole semantic web and semantic technologies that I mean to find this, I have this intent. There’s meaning in that customer half of the equation, then you’ve also with your metadata and taxonomies and other content attributes you’ve identified like, okay, oh, I’ve got content about that. I can satisfy that intent. But that seems like more than just metadata and taxonomy. Ontological practice of doing some harder work to stitch, or can you do sort of a prototypical version without the hard work?
Sabine:
Well, I would say, yeah, and as my grandmother would say, “Yes and no.” There’s always hard work because we’ve had a long history of very long publications that answer if you think about going back to information mapping, which I forgot who the guy is that did that information mapping, but that was from the forties. The principles have been in a place for a long time and it takes a long time to move the albatross or whale or whatever it is into a different direction. You’ve heard me talk about the need for certain pieces of metadata around who is the user? What is their role? And what are the goals that are associated with that particular role? And then even within the goals, what are the tasks that are associated with a particular goal in a particular role that’s maybe a specific person? And then what is their skill level within that?
Sabine:
There is a nesting of ever finer granularity things. And then on the content type-side, I haven’t even talked about that. Certain content types have associated contents. We know what’s in an integration guide. We know what’s in an implementation guide. We know what’s in an upgrade guide. We know what’s in a user guide, service guide, troubleshooting manual, all of those things have meaning. But the content that they contain are the answers. Being able to have a smaller level of granularity of content that’s associated, even at a content-type level is also an important part of that mix. And then the date. We always want to make sure that the content we’re serving up is technically accurate, complete, and the most current. The dates that are associated around the content are also an important part of this mix, slurry. Slurry that we’re creating.
Larry:
A slurry, I love that. I’ve never heard structured content referred to as slurry, but it’s kind of like that. You have this bucket of stuff and you’re like, I need to dip in and get the right thing out. Yeah. And that sort of gets at the implication of all this is that there is structured content there. And that’s a whole other side of this equation that we haven’t really talked about that much. And to what you were just saying, that the content, the exact artifact that you create for that interaction may draw on different parts of that. This is the whole point of structuring content so you can put it together in different ways. Does this magic chatbot machine of yours, how does it do that?
Sabine:
Well, so chatbots work by determining intent. They need to know who are you? What are you trying to do? What are you trying to do it with? Whatever that’s product or whatever. And then what is your question? And then too, once they’ve assessed those four or five pieces of mission critical intent, then they have to be able to match. And I’m using sort of people terms, but this is sort of machine processing of course. They have to be able to match the answers to all of those questions, to some finite number of pieces of content. Ideally it should be one, but maybe they might have to ask a few more questions.
Sabine:
For example, one client that I worked with at Comtech, they also had the additional sort of layer of, “What is your consumption method? How do you want to consume this information? Do you want to listen to it? Do you want to watch it? Do you want to read it?” In addition to their chatbot needing to be able to determine all these intenty things, then the final question is, well, okay, how do you want to consume that information? Do you want to watch it? Do you want to read it? Do you want to listen to it? And so I love that. And in terms of the metadata, that’s one of the easiest ones because you have pretty strong MIME typing. And so you could just associate the consumption method with the MIME type. But I think that hopefully the takeaway is this is not fast.
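The matching step can be pictured with a small Python sketch: the answers collected by the chatbot form an intent, the intent is compared against the metadata on each content object, and a follow-up question about consumption method narrows the result. This is a plain exact-match lookup under assumed field names, not how any particular chatbot product works; real systems determine intent with language processing that is out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """What the chatbot has learned about the user so far."""
    role: str
    goal: str
    product: str


@dataclass(frozen=True)
class ContentObject:
    """A granular piece of content and the metadata tagged on it."""
    title: str
    role: str
    goal: str
    product: str
    consumption: str  # "read", "watch", or "listen"


def match(intent, library, consumption=None):
    """Return content whose metadata equals the intent on every field.

    If a consumption method is given, narrow the hits to that method.
    """
    wanted = (intent.role, intent.goal, intent.product)
    hits = [c for c in library if (c.role, c.goal, c.product) == wanted]
    if consumption is not None:
        hits = [c for c in hits if c.consumption == consumption]
    return hits


# Hypothetical library, for illustration only.
LIBRARY = [
    ContentObject("Purge the cache (article)", "admin", "purge-cache", "edge-cache", "read"),
    ContentObject("Purge the cache (video)", "admin", "purge-cache", "edge-cache", "watch"),
    ContentObject("Read cache metrics", "developer", "monitor", "edge-cache", "read"),
]

intent = Intent(role="admin", goal="purge-cache", product="edge-cache")
candidates = match(intent, LIBRARY)        # two hits, so ask one more question
answer = match(intent, LIBRARY, "watch")   # one hit after the follow-up
```

When the first pass returns more than one object, that is the cue for the extra question, here the consumption method, until ideally one piece of content remains.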
Larry:
Exactly. No, I’m getting that. Hey, a quick, I just want to do a quick terminology check. You use the term MIME. Can you define that for the folks? What a MIME type is.
Sabine:
Yeah. Every piece of information has an extension on it that tells the processor something about what to do with it. JPEGs and BMPs and SVGs, we know that those are images and so processors know what to do with the images definitely based on the metadata that’s associated with that object. We know that in terms of HTML files, XML files, DITA files, all of those are files, contents with MIME types that processors know what to do with. And then there’s MPGs for sound and or video. This is what I’m talking about. Some relationships are direct. This thing is tagged with this thing that tells us about it. Some relationships are indirect. We know that all of these lists of these image types are images. If you select, look at, then we just know we have to just find the content that we’re going to find, it’s going to have one of those four or five values. That’s a sort of indirect association.
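To make the indirect association concrete, here is a short Python sketch that uses the standard library's `mimetypes` module to guess a MIME type from a file name and then maps the top-level type (text, image, audio, video) to a consumption method. Strictly speaking, a MIME type is a label such as `image/png`, and the file extension is one common hint for guessing it. The consumption labels and the `consumption_method` function are assumptions for this example.

```python
import mimetypes

# Hypothetical mapping from a MIME top-level type to a consumption method.
CONSUMPTION_BY_TOP_LEVEL_TYPE = {
    "text": "read",
    "image": "look at",
    "audio": "listen",
    "video": "watch",
}


def consumption_method(filename):
    """Guess the MIME type from the file name, then map it to a method.

    Returns None when the type cannot be guessed or has no mapping.
    """
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        return None
    top_level = mime_type.split("/", 1)[0]  # "image/png" -> "image"
    return CONSUMPTION_BY_TOP_LEVEL_TYPE.get(top_level)
```

No file is tagged "look at" directly. The link runs from the file to its MIME type and from the MIME type's group to the consumption method, which is the indirect relationship described above.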
Larry:
Right. And that, I guess, is this where we get into going from taxonomy to ontology because an indirect connection, taxonomy is not really good at that, but an ontological engine of some kind is, right?
Sabine:
Right. There’s ontology and then there’s ontology. I tend to think of things as being sort of what I call taxonomy ecosystem that you have these disparate pieces of taxonomy on your continuum of the taxonomy complexities that you’ve associated with your content. And some of them have relationships, products have product versions, et cetera. That’s one type of ontology where the relationships could either be explicit, so they’re defined and explicit in their tagging or they’re inferred or indirect like the example that I just gave about the images. A true ontology management system says that we can make inferences a lot more sophisticatedly than the example I gave about the image MIME types are things that you want to look at.
Sabine:
The problem that exists with that sort of ecosystem environment is that there are very few tools that do that effectively. And there’s very few delivery platforms that are able to expose the outcomes of those relationships or inferred relationships. Metadata again, doesn’t do anything. It just is enabling. You have to have some means to be able to expose it to your users and delivery platforms are not very good. And if you think about that, that’s even within a controlled space where you’re having your users log in, where you have a fair amount of control and you can really do a good job with your search experience. Now think about when your content is publicly accessible, so you’re dealing with commercially available search engines like Google, which don’t really give you that many options on how to optimize your search, and no search engine knows how to utilize OWL-based or any sort of sophisticated semantic language.
Sabine:
To expose those inferences and those indirect relationships, that just doesn’t exist. And if you think about it, there’s so much of the dark web, do you really want to have some system beyond your control, associating your curated content with something else that you don’t know that it’s going to happen? There’s some kinks to be worked out about that. And I think that’s why you’re seeing that there’s not very many companies that are clamoring saying, “We need the web to be able to deal with our ontology and to be able to effectively map and manage those relationships between our content objects.”
Larry:
Well, we knew this before we went on the air, but you just opened two whole cans of worms that could be whole other episodes: the governance issues, and also the tooling, the inadequacy of the tooling that we’ve got right now. So I have a feeling we might be talking again. But anyway, hey, Sabine, we’re coming up on time. These always go so quickly. And especially in a topic like this, where we could probably talk literally all day, but I want to wrap things up. But before we do, is there anything last, anything that’s come up in the conversation that you want to make sure we talk about?
Sabine:
Yeah, well, I want to say one thing and I want to put a giant caveat because I can see just in my mind, I can see the comments that are going to come as a result of this. I didn’t say that there weren’t any tools that are available, but they’re not commercially viable. They’re not regularly available. And I’m sure that there are some sort of brain trusts that are creating natural language processors and ontology-aware delivery platforms, but those are small in number. What I’m talking about is something that every company, small, medium and large size companies will have access to. That’s an important thing that I would just want to clarify, that when we’re talking about availability, it’s availability with a capital A and not with a lowercase a.
Larry:
Got it.
Sabine:
And I would say that maturity, governance, processes, collaboration at the enterprise level are all of the, that’s the episode of Star Wars that was before this one. Those are the things that we should talk about too because they need to be in place before you can get to this episode.
Larry:
Okay. We’ll do the prequels in, yeah, I won’t wait 20 years, but no, you’re right, because there’s a whole, because you kind of got to this, I think that the theme of this is that you’ve been doing some of the standards work and the basics of this, the scientific foundation that makes all this stuff work is well-known, well-established. People have been working on it for 20-plus years. And then some practices have emerged that people like you are getting good at and better at all the time and now the tools are coming along to help you. And it’s going to be fun watching that journey.
Sabine:
Yeah. I think it’s the use cases too. I feel like the virtual assistants, I can hardly wait to see where we’ll be in 15 years. We’ll never be speaking to a human again, in any of our interactions. We’ll be interacting within. That’s going to be some exciting content changes and content strategy changes and modifications using metadata in order to make that happen. And I’m excited about it.
Larry:
I am too. And that’s kind of my intent in having this sort of thread to the podcast. And that’s going to be more of that in season four. We’ll be doing much more of this stuff, because I think as content strategists, we need to be, this is coming, we need to be ready. We can’t just abdicate this to the technologists. We’ve got to be there getting our content stuff in as well.
Sabine:
Yeah, that’s right. That’s right. Right. Because that’s the experience. It’s the content. The other stuff is the enablement, but the experience is the content and that right size of chunk that answers a question and gets your user back onto their next thing.
Larry:
And that’s another whole episode about the user experience angle. Well, thanks so much Sabine. This has been great. Hey, one very last thing. What’s the best way for people to connect with you? Are you active on social media? Or where do you like to connect with folks?
Sabine:
Well so I’m pretty active on Slack and I’m not sure how you would engage with me on a Slack channel. I also am very active on LinkedIn so please reach out to me, Sabine Ocker, S-A-B-I-N-E O-C-K-E-R and I’d be happy to engage in dialogues and answer your questions or point you to resources, all those fun set of things.
Larry:
Great. And I’ll put that stuff in the show notes as well.
Sabine:
Great. Okay.
Larry:
Well, thanks so much, Sabine. I really enjoyed this. Fun conversation.
Sabine:
Well, thank you for having me, Larry. I really, really enjoyed it and I enjoyed our conversation very much.