
Patrick Bosek helps businesses author and manage content that is structured so that it can be used for many purposes, not just one-off publications.
Structuring documentation this way creates intelligent content that both addresses current customer needs and anticipates future demands.
The DITA standard underlies this approach to creating and managing the smart, adaptable content that modern businesses need to complete their digital transformation.
We talked about:
- the benefits of componentized content over systems that work from a page paradigm
- the benefits of a future-looking approach to content distribution
- the difference between operating from an “authored” perspective and a “published” perspective
- the use of DITA and XML in structuring enterprise content
- the benefits of DITA, a stable, battle-tested standard
- the differences between how you store content and how you format it for delivery on the web
- his thoughts on “digital proximity” – the idea that shows the importance of getting your content to the user in a way that is sensitive to their context in the moment and in the language they speak
- how componentizing your content future-proofs it for presentation in new contexts
- his take on digital transformation as it pertains to content
Patrick’s bio
Patrick Bosek is a co-founder of Jorsek Inc, makers of easyDITA. Since beginning with Jorsek in 2005, Patrick has worked on a wide range of projects, all focused on improving the authoring, production, and distribution of content. Most recently, his primary focus has been empowering the users of easyDITA and generally advancing the product documentation industry.
Podcast intro transcript
This is the Content Strategy Insights podcast, episode number 98. Mention the word “content,” and most people immediately think of publications like books, web pages, and social media posts. Patrick Bosek is quick to point out that we live in a post-publication content world. Yes, books and web pages still exist, but nowadays your content needs to be created and shared in ways that let you use and re-use it intelligently. The DITA standard drives his approach to managing the smart, adaptable content that modern enterprises need.
Interview transcript
Larry:
Hi, everyone. Welcome to Episode Number 98 of the Content Strategy Insights podcast. I’m really happy today to have with us Patrick Bosek. Patrick is the CEO of a company called Jorsek which makes easyDITA and other products. Welcome, Patrick. Tell the folks a little bit more about your role there at Jorsek and a little bit about easyDITA.
Patrick:
Yeah, sure. Thanks, Larry. So easyDITA is our primary product. It’s a component content management system. It’s really focused on product reference and knowledge content. So you can think of that as being a full range of things, from knowledge bases, help articles, to product documentation that would more typically, or maybe historically, be delivered as a longer format. So it can really handle everything from very componentized, atomized content, all the way up to very large books, thousands and thousands of pages, and then it manages the whole content life cycle around it, and presents a platform for actually getting it out through different channels to the end user.
Larry:
Nice. A lot of my listeners, I know, work in… There’s a whole range of ways you can publish stuff in the digital world now, but a lot of people come out of the conventional CMS world, where it’s kind of like the authoring process of putting the thing in the CMS, and then how it’s stored, and then how it’s ultimately displayed. There’s almost, not quite a one-to-one correspondence, but it’s a pretty… the same blob going through the system. I’d love it if you could talk a little bit about componentized content and its benefits to the kind of publishing that your clients are doing.
Patrick:
Yeah. So one of the things that you’ll notice is that I kind of avoided the term publishing in my description. I did that intentionally, because I knew this question was coming. And I think it’s that, traditionally speaking, when you’re talking about a CMS, I think most people, especially probably of my age and older on the internet, think WordPress, right? That’s like the most default CMS that everybody thinks about. And that’s very much a page paradigm. You kind of go in and you make your edits and publish the page. Well, that works okay when all you’re really doing is delivering pages, and that’s what that was designed for, but in a more modern world where we’re designing content for many uses, many applications, many channels, many audiences, the page paradigm still works as one of the targets, one of the places people will consume your content, but it can’t be the primary mechanism that you’re designing your content around as you’re building it.
Patrick:
So what a lot of people who are designing these things today, doing information architecture today, are thinking about is componentizing content. So you’ll hear people refer to this as atomized content or micro content or component content or structured content. I mean, there’s a bunch of different names, and I don’t think you really need to get so fixated on which name you actually apply to it, but what it really is, is the idea that, as you’re creating content, you’re putting it into something which is structured, so it has the ability to have some semantics and metadata, either on or around or in it, ideally, all of those, and you’re framing this content in such a way where the piece of content is as small as possible while still being useful.
Patrick:
And typically, the way you’ll see that kind of approach is that the content should be about one thing. It should be one description. One idea. One answer to a question. One procedure. One policy statement. And it’s very much like it’s just… it’s self-contained. It can move by itself. It’s structured. It’s as small as it can be while still being useful. And then you assemble it into things and some of those things might be pages.
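The idea of small, self-contained components that get assembled into larger things can be sketched in a few lines of Python. This is only an illustration, not how easyDITA or any DITA system stores content; the `Component` class, the `assemble` function, and the sample content are all hypothetical names made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """One self-contained piece of content about exactly one thing."""
    id: str
    kind: str    # illustrative values: "procedure", "policy", "concept"
    title: str
    body: str
    metadata: dict = field(default_factory=dict)


def assemble(title, component_ids, library):
    """Build a page-like deliverable by referencing components by id."""
    parts = [title.upper()]
    for component_id in component_ids:
        component = library[component_id]
        parts.append(f"{component.title}\n{component.body}")
    return "\n\n".join(parts)


# Hypothetical sample content, invented for this sketch.
library = {
    c.id: c
    for c in [
        Component("reset-password", "procedure", "Reset your password",
                  "Open Settings, choose Security, then select Reset.",
                  {"audience": "end-user"}),
        Component("password-policy", "policy", "Password policy",
                  "Passwords must be at least 12 characters long.",
                  {"audience": "all"}),
    ]
}

# The same policy component is reused in two different deliverables.
user_guide = assemble("User guide", ["reset-password", "password-policy"], library)
faq_answer = assemble("FAQ", ["password-policy"], library)
print(user_guide)
```

The page is just one possible assembly. The policy statement exists once and appears in both outputs, which is the reuse benefit described above.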
Larry:
Right, and one thing I want to quickly interject here is that you do a podcast, and I love that you model exactly what you were just talking about. It’s like, what’s the title? It’s called Content Components?
Patrick:
Well, it’s called Content Components.
Larry:
Yeah.
Patrick:
But we put the Content in front of the Components, just so people would actually be able to find it when they search. So when we talk about it, we just call it Components.
Larry:
Yeah. Components. So I love that you’re walking the walk everywhere you go, even on your podcast. But to that, all that stuff you’re talking about, and you mentioned in there, the reason that all of this is necessary is that we have these many different channels we’re pushing content into, and in many cases, increasingly, that content is personalized. And then you also mentioned metadata in there.
Larry:
Can you talk about how the sort of assembly process on the other side of this authoring process, that puts these, like you said, it might be a thing that looks very much like that old document that you saw 10 years ago, but a lot of these are going to be new kind of styles of content, right?
Patrick:
Right. Yeah, absolutely. And I think that a future-looking process of delivering content needs to have a blend of ways that it does that. So I don’t think PDF is ever going to die completely, because there is some utility of PDF and a lot of our customers will still use PDF to some capacity. It’s rare that it’s the primary way they’re delivering content. And if they are, we encourage them to think a little bit more about the customer experience to get some more experiences that are real-time web, those types of things. But that’s always going to have an artifact published style process, right? You’re either going to have an automated process which does it on some event, or you’re going to go in and you’re going to push a button somewhere, maybe a series of buttons, and it’s going to spit out a PDF.
Patrick:
And there’s nothing wrong with that. That’s a thing that’s going to… there’s a necessity to keep that and there’s value there, and just because it’s been done for a long time, doesn’t mean you should stop doing it or feel you should have to stop doing it. But when you’re starting to think about delivering content to people in more of an omni-channel paradigm, and I’ll define what I mean by that in a second, you have to start to think of this as more of an authored perspective, than a published perspective. So the reason that you want to think about it this way is because the concept of publishing really has meant for a long time, and I think it still does mean, that you create something. That you’re physically generating something or outputting something. Something is being created. That’s kind of going all the way back to publishing a book, but when you’re thinking about the way that content gets into an omni-channel-enabled environment, what really happens is you do something to say, “The content is ready and it’s available here.” And here is almost always an API of some sort.
Patrick:
So part of the reason for having your content structured and presentation-agnostic is that when you say, “Here’s content,” you don’t want it to be, “Here’s content for this.” You want it to be, “Here’s content,” and then “This” decides what it’s going to do with it. And that might be your website. That might be in an app. It might be somebody else’s website. It might be a commerce site. It might be your product. There’s just a really wide number of things: chatbots, knowledge bases, info systems like dashboards, whatever. All these different things can consume this content when it’s structured and it’s authored, right? So you basically say, “We run on this open standard. This is the API where you access the content in that standard. This is the way that we go about alerting you when it’s authored.” And then people can hook into that and they can do whatever they need to do with it.
Patrick:
And this is really how you get to someplace where you can start to drive a truly omni-channel strategy around content. And you can get to a place where you’re looking at content in the right way, from an enterprise perspective. It’s becoming something which is operationalized, content ops, and it’s being done in a way where it can be collaborated on asynchronously, across teams, and in a truly standardized enterprise way.
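A rough way to picture “Here’s content, and the consumer decides what to do with it” is a small hub that announces when content is available and lets each channel fetch and render it in its own way. The sketch below is an assumption-laden toy in Python: `ContentHub`, its methods, and the two consumers are hypothetical and do not represent any real product API.

```python
class ContentHub:
    """Hypothetical stand-in for a delivery API with change notifications."""

    def __init__(self):
        self._content = {}
        self._subscribers = []

    def subscribe(self, callback):
        """Register a consumer that wants to hear about available content."""
        self._subscribers.append(callback)

    def make_available(self, content_id, content):
        """Say 'the content is ready and it is available here'."""
        self._content[content_id] = content
        for notify in self._subscribers:
            notify(content_id)

    def get(self, content_id):
        return self._content[content_id]


hub = ContentHub()
outputs = {}


def website(content_id):
    # The website decides on its own markup.
    content = hub.get(content_id)
    outputs["web"] = f"<h2>{content['title']}</h2><p>{content['body']}</p>"


def chatbot(content_id):
    # The chatbot turns the same content into a single reply line.
    content = hub.get(content_id)
    outputs["chat"] = f"{content['title']}: {content['body']}"


hub.subscribe(website)
hub.subscribe(chatbot)
hub.make_available(
    "battery-care",
    {"title": "Battery care", "body": "Charge the battery fully before first use."},
)
print(outputs["web"])
print(outputs["chat"])
```

Nothing in the content itself says “for the website” or “for the chatbot”. Each consumer hooks in and applies its own presentation.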
Larry:
Right, then when you talk about standards, and you mentioned… Part of this, one of the underlying standards is the XML format, and DITA in particular, it just occurred to me that we haven’t really talked about that. And some of my listeners may not be familiar with those standards and technologies. Can you talk just a little bit about how the XML format and the DITA standard permit all this stitching together?
Patrick:
Sure. So, DITA is a type of XML. So is HTML. HTML is a type of XML, really broadly speaking, and DITA is a kind of a combination of some rules, some concepts, and then using the general mechanisms of XML, which is really just structured content. And DITA is really uniquely suited for a more enterprise approach to content, especially from an author delivery perspective. And it is for a few very specific reasons. And this isn’t to say that other systems can’t do this, it’s just that DITA has been designed to do this, and it works quite well in doing this.
Patrick:
One of the biggest things is the ability to have content which is presentation-agnostic, so it’s just content, semantic. Any system can decide how it wants to render it, but then also evolve that structure without interrupting other systems.
Patrick:
So if you get really nerdy about it, DITA stands for Darwin Information Typing Architecture. You can ignore that whole thing, except the Darwin part. The Darwin part means that it’s supposed to evolve. And what that really comes down to is you can come in and you can say, “We need more semantics at this point in our content in order to enable this thing downstream.” But in order to do that in most systems, you have to add something in which can be disruptive to other uses of that content. Like if you go in and you add fields or you modify fields, or you change structure, you have to alert all the different systems that use it in order for them to leverage it properly.
Patrick:
With DITA, you could go on and you can take, let’s say a note or a paragraph or a list item, and you could say, “Okay, this list item is now a warning note, or a warning list item,” or something like that, so it’s very specific in what it does. And that’s a bad example, but I’m just making it up right now. And since it’s based on a list item, it’ll act like a list item everywhere, except the place that you need it to act like a warning list item. And this means that one team is able to iterate on their business requirements and maintain the enterprise architecture and collaborate across all teams, and other teams can choose to adopt this at their own pace, if at all, but their usages of it, their business requirements, aren’t negatively impacted because it will fall back to that original thing, so you can evolve independently. And that’s one of the things that makes DITA really uniquely suited for enterprise architecture, enterprise content implementations.
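The fallback behavior described here rests on the DITA `class` attribute, which lists an element’s ancestry from most general to most specific. A consumer that does not recognize a specialized element can process it as its ancestor. The Python sketch below mimics that lookup under stated assumptions: the `warning-li` element and the `safety-d` module are invented for illustration (echoing the made-up example in the conversation), and the renderers are toys, not a real DITA processor.

```python
import xml.etree.ElementTree as ET

# "warning-li" and "safety-d" are hypothetical names, not part of standard DITA.
SOURCE = """
<ul class="- topic/ul ">
  <li class="- topic/li ">Unplug the device.</li>
  <warning-li class="+ topic/li safety-d/warning-li ">Do not open the case.</warning-li>
</ul>
"""


def class_chain(element):
    """Return module/type tokens from most specific to most general."""
    tokens = element.get("class", "").split()
    # The first token is the "-" or "+" marker, not a type.
    return list(reversed(tokens[1:]))


def render(element, handlers):
    """Use the most specific handler this consumer knows about."""
    for token in class_chain(element):
        if token in handlers:
            return handlers[token](element)
    raise ValueError(f"no handler for <{element.tag}>")


# One team only knows plain list items.
basic = {"topic/li": lambda e: f"* {e.text}"}
# Another team has adopted the specialized warning item.
safety_aware = {**basic, "safety-d/warning-li": lambda e: f"! WARNING: {e.text}"}

items = list(ET.fromstring(SOURCE.strip()))
basic_out = [render(e, basic) for e in items]
aware_out = [render(e, safety_aware) for e in items]
print(basic_out)
print(aware_out)
```

The consumer that never heard of `warning-li` still renders it as an ordinary list item, so the team that introduced the new semantics does not break anyone else.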
Larry:
Yeah, and that’s a really helpful way to understand that, because the ability to evolve… like, I think of the difference with HTML, that they had to do a whole revamp for HTML5 to strip away some of this stuff that was constraining us to get to this. DITA has had that built in for… it’s, what, an 18 year old standard or something like… almost 20 years old or something, I think?
Patrick:
Hmm. That’s a great question. So it’s older than 10 years, I can say that much.
Larry:
Yeah.
Patrick:
It’s been around for a while. It’s been revved several times. It’s very stable. So the general paradigms of power and the DITA standard have been pretty effectively worked out since the 1.1 version, which is more than 10 years old, and almost everybody uses either 1.2 or 1.3. So it’s based on a lot of real world implementations and applications, and it’s been shown to work really effectively. So it’s a very stable standard. It’s very battle tested. It’s very strongly understood. And the process of releasing a 2.0, I think it’s this year. It’s either this year or next year. I should probably be more involved than I am, but time is what it is, right?
Patrick:
And that is going to introduce some changes and resolve some of the issues that we’ve noticed along the way, but largely it’s built on the same exact paradigms, the same exact general concepts, in terms of how this should work and how it can be used and leveraged by a large number of people for really large content implementations.
Larry:
And the reason I brought that up is it seems like that that kind of extensibility and evolve-ability is what makes it useful, and there’s enough of a history, however long it exactly is, that it’s proven that it can do that.
Larry:
One thing I want to ask you about that, and I know there’s a little bit of a flame war going on this in some parts of the world, so I hope it doesn’t erupt here, but I have a lot of buddies who were more in the cool startup modes with JSON and other formats for storing content and authoring content, is that it occurs to me that JSON’s just a format. It’s not a standard. Do you see… I guess, is there any place in your world, does XML and DITA, do those play well with JSON? Is that kind of a moot point or is it like Blu-ray versus… the digital format wars kind of thing?
Patrick:
I’d say it’s somewhere in between. So there’s no mutual exclusivity between these two things, because even when somebody is using JSON as their connector format, so we author our content in JSON over our API, right? Again, there’s that word, “authored.” The content inside of it is either going to be a simple string or it’s going to be HTML, pretty universally, and you can take the content for a DITA document, and you can render it to HTML very easily. There’s very standardized ways of doing that and you can put it into JSON format, and then you’ve effectively accomplished the exact same thing. And as a matter of fact, the delivery API, the system that we provide as the standard mechanism for our customers to come in and get content which has been offered for consumption, is JSON. It’s not an XML API. JSON is better for web APIs, period. It just is.
Patrick:
But that doesn’t really say anything about how you store the content. So if someone tells you that JSON is a better storage format for content, they’re either not exactly understanding what’s going on there, or they’re papering over it for simplicity’s sake.
Patrick:
So if you look at a headless CMS, which are really good for a lot of things, they’re… as we’re actually evolving some of our content implementations, we’re considering using a headless CMS for some of our marketing activities. That’s something that we’re evaluating. But what a headless CMS is, is it’s really just a system that manages web forms. So you can go on and you can create a form, and some of those fields can be long text fields, and the actual content itself is almost always structured with Markdown, which is a very simple text-based format for applying a little bit of style in the content. So headings, lists. Markdown’s not great at tables, so typically tables are actually HTML, that kind of stuff.
Patrick:
And then when people go to access that content, the Markdown is typically rendered to HTML, and provided as one of the fields in the JSON object that is part of that object. So these things inter-operate very easily, and anybody who tells you that it’s JSON or XML is kind of missing the point because it’s the wrong part of the stack to be having that conversation.
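The split between storage format and delivery format can be shown with a toy pipeline: content is stored as structured XML, rendered to HTML, and handed out as a field in a JSON object. The sketch below is Python using only the standard library. The `to_delivery_json` function and the `body_html` field name are assumptions made for this example; real DITA publishing toolchains handle far more than the title and paragraphs covered here.

```python
import json
import xml.etree.ElementTree as ET
from html import escape

# A tiny DITA-style concept topic used as the stored form.
STORED = """
<concept id="digital-proximity">
  <title>Digital proximity</title>
  <conbody>
    <p>Be close to your customer in a digital world.</p>
    <p>Use their language &amp; their channel.</p>
  </conbody>
</concept>
"""


def to_delivery_json(xml_text):
    """Render stored XML to HTML and wrap it in a JSON payload."""
    topic = ET.fromstring(xml_text.strip())
    paragraphs = topic.findall("./conbody/p")
    html_body = "".join(f"<p>{escape(p.text or '')}</p>" for p in paragraphs)
    payload = {
        "id": topic.get("id"),
        "title": topic.findtext("title"),
        "body_html": html_body,  # hypothetical field name
    }
    return json.dumps(payload)


delivered = json.loads(to_delivery_json(STORED))
print(delivered["title"])
print(delivered["body_html"])
```

JSON is the envelope on the wire and HTML is what sits inside it, while the stored source stays XML. The two formats sit at different layers rather than competing.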
Larry:
Yeah, and I think we’re increasingly in this world where everything is decoupled or increasingly decoupled, or the expectation that it’s decouplable, so that people can do stuff with it. And I guess the main point there is that for the most part, all this stuff plays together pretty well. And that, for you as an information architect or something in a big org, you just need to make some choices about which format. Like, Markdown works fine for simple content, but if you get at all complex, you need to look at DITA or something like that.
Larry:
Hey, one thing I want to come back to, you’ve mentioned a few times. I’ve pictured your head being actually in cyberspace. You’ve mentioned… you use the word “Here,” and something being able to be anywhere, this authored content that’s ready for omnichannel. It’s there in this place. And one of the concepts we talked about before this podcast was the idea of digital proximity. I’m wondering if that’s the thing that stitches some of that… That can help me get my head around that geographic terminology that you’re using. Is that true, or…
Patrick:
Yeah, so digital proximity is one of my favorite terms right now, and I don’t know if it’s popular enough to call it a buzzword, but if it is, it’s one of my favorite buzzwords, and it’s because it’s this really interesting concept. So the idea of digital proximity is it’s intentionally relating the concept of being where your customer or where your audience is very much the way that physical proximity is. So if you kind of take the physical analog of it, when you are a store or you’re selling something, or something in the physical world, you need to be physically close, proximitous, to your customer, right? So you’re Walmart, you put a store in Rochester, which is where I live, right? Maybe you put a few of them there, because you want to be close to your customer.
Patrick:
And that was how business was done for a very, very long time. And the idea of being able to expand was the idea of being able to put your brand, your business, your company in closer proximity to your customer in a lot of ways. So obviously as digital media and the digital landscape has evolved, the ability to be close to your customer has not only become much more low cost, and you’re near zero, but not zero, it’s also evolved quite a bit. So there’s many different aspects of having different ranges of digital proximity too. So when you think about digital proximity, you have to consider, is the content in the right language for your user? In that language, is it localized, even? So the concept of localization just isn’t always translating. It can be actually like brain language to make sense to the user where they are, so localization can be an aspect of digital proximity.
Patrick:
Is the content in the systems that they’re using? So this kind of brings in and ties in the omni-channel aspect of it. As we move out into a world where it’s less, “I have a website, you have a website.” It’s more like, “We have websites, but we’re also maybe publishing content to Amazon or we publish content over here.” Or “We offer content to be consumed by this feed,” or whatever that may be, now we have to think about, “Where is our user when they need to be able to get information and content from us and are we providing the right content to them in that location?” And then finally, “Is the content that is being provided to the person in the place, the digital place, where they are, is it really the right content for them in that moment?” right?
Patrick:
And this kind of goes back to the whole right language, right device, right time, that kind of thing you’ve been talking about for a long, long time, but it’s a different way of thinking about it. In my opinion, it’s a much more tangible way of thinking about it. So are you delivering the content that is useful to them, where they are, in their language (and that doesn’t just mean English versus Spanish), as they need it? This is the process of having good digital proximity. It’s being close to your customer in a digital world.
Larry:
Yeah. And you can see the… A couple of things about that. There’s, one, just that it’s a great way to think about it. It’s a great little bucket to put all this stuff in to think about how you want to stitch things together. But you’re also reminding me when we’ve just kind of been adding, I think the original definition of content strategy was, “Getting the right content to the right person at the right time.” And then they said, “In the right channel and the right medium,” and we just keep adding onto that.
Larry:
And two episodes ago, I had… I think it was two episodes ago, I had Cheryl Platz on. She just wrote a book called Design Beyond Devices. I don’t know if you’ve seen that yet, but she adds to the… I picture it. I told her in the interview, I pictured it: at the end of these omni-channel, these many channels, we now have this hydra head of what she calls “modalities,” the kinds of interaction, whether it’s old school, visual, where you have a GUI, or your new voice interface with an Alexa or something like that, or even more haptic, touch-related things or gesture-related interfaces, or even the ambience of the place you’re in.
Larry:
But it seems like… I guess my intent in mentioning that is that it seems like the way you’ve done things, you’re already ready for that. They’re like, there’s this generic, future-proofing in the way you componentize content that makes you ready for new insights and developments like that. Can you picture dealing with what I just talked about with the way that you handle componentized content now?
Patrick:
Yeah. Let me give you a really specific example of that. So in our system, which is just using, leveraging DITA, so this would be true of any DITA system, when you create a procedure, it’s not just a numbered ordered list. So if you’re in Word, what you do is you go click the number list button, and you write your stuff, so it’s numbers, right? That’s your procedure. In our system it is actually physically steps. These things are called out. And before the steps, you can have prerequisites, you can have context. You can… and these things are really semantically called out too.
Patrick:
So the fundamental difference here is that if you think about this being rendered on a webpage, the difference isn’t probably huge. Almost all of our customers do render procedures a little differently, because it does help the user understand that what they’re looking at is to be interpreted differently. But now think about a voice interface and think about how different it is for a voice interface to be able to say, “This is the context in which you will do this. Are you in this context?” And you can say, “Yes. That’d be great.”
Patrick:
The prerequisites for following this procedure are this. If those are just paragraphs, there’s no way the voice interface is going to know to take those things, right? It’s very, very simple with the content you create in our system. “Prerequisites are fulfilled? Great. Let’s move on. Step number one, blah, blah, blah, blah, blah, blah, and do this.” Well, one of the other elements that’s semantic in a step is “expected result.” So it can ask you, “Did you get this result?” And you could say yes or no, and then it can move from there. But by having these simple bits of semantics wired into these procedures, what you’re doing is you’re giving a computer, which can manifest itself as voice interfaces, as steps rendered inside of an app, as all different kinds of things, the ability to actually intelligently understand what it is that the author wanted the interaction of the end user to be.
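As a minimal sketch of how a voice interface might use that structure, the Python below reads a small DITA-style task and turns its context, prerequisites, steps, and step results into spoken prompts. The element names (`task`, `taskbody`, `prereq`, `context`, `steps`, `step`, `cmd`, `stepresult`) come from the DITA task model; the sample task, the `voice_script` function, and the prompt wording are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample task, invented for this sketch.
TASK = """
<task id="replace-filter">
  <title>Replace the water filter</title>
  <taskbody>
    <prereq>You have a new filter cartridge.</prereq>
    <context>Do this when the filter light turns red.</context>
    <steps>
      <step>
        <cmd>Twist the old filter counterclockwise and pull it out.</cmd>
      </step>
      <step>
        <cmd>Push the new filter in and twist it clockwise.</cmd>
        <stepresult>The filter light turns green.</stepresult>
      </step>
    </steps>
  </taskbody>
</task>
"""


def voice_script(xml_text):
    """Turn the semantic parts of a task into a list of spoken prompts."""
    task = ET.fromstring(xml_text.strip())
    body = task.find("taskbody")
    lines = [f"Procedure: {task.findtext('title')}."]

    context = body.findtext("context")
    if context:
        lines.append(f"{context} Are you in this situation?")

    prereq = body.findtext("prereq")
    if prereq:
        lines.append(f"Before you start: {prereq} Is that done?")

    for number, step in enumerate(body.findall("./steps/step"), start=1):
        lines.append(f"Step {number}: {step.findtext('cmd')}")
        result = step.findtext("stepresult")
        if result:
            lines.append(f"Expected result: {result} Did you get this result?")
    return lines


script = voice_script(TASK)
for line in script:
    print(line)
```

If the same procedure were stored as a plain numbered list with a few loose paragraphs, the code would have no reliable way to tell the prerequisite from the context or a command from its expected result.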
Patrick:
So there is no doubt that right today, even if you’re just publishing super basics, PDF, website, right, that publishing, creating content in a structured fashion, in this case DITA, is a stronger mechanism. It’s going to give you the ability to publish those two channels more efficiently, more consistently, reuse that content, higher scalability, all that stuff, so it’s a win today, but it also gives you the future, as you start to move into other formats, other channels, other interactions, other customer experiences, be they an app, be they voice, be they mobile things, be they any other set of interactions that you may want to have as your user, as Internet of Things comes up, and digital customer experience just permeates everything.
Larry:
That’s great. And I love that you used the word… you talked about the intelligence of the content in there. It’s this smart stuff that can adapt to all kinds of new situations. But, anyhow, we’re coming up on time, Patrick. These things always go so quickly and I could talk forever, but I like to keep these around a half hour. I want to make sure, though, is there anything last, anything that has come up that you want to make sure we elaborate on, or just that’s on your mind about componentized content or, or digital content in general?
Patrick:
I think that the world is going through a transition right now that’s pretty interesting. So I think that there’s kind of a 1.0, 2.0 for digital transformation as it relates to content. And I think we’re kind of in the 2.0 phase, where we got to the point where our content is literally for the most part digital, right? So at worst it’s in Google Docs or Word or something, so we’ve kind of solved getting everything into more of a digital paradigm in that way, but where we need to be going with this is we need to get our content to the point where it has the richness of metadata and semantics that are really going to support the next set of things that we want to do with it. And this isn’t always the easiest thing in the world to do.
Patrick:
It’s very, very easy to think of content as kind of, the thing that goes into the system, but the system is the thing that solves the problem. And that’s fundamentally backwards. The right way to think about the future, especially, is not, “We need a knowledge base. So let’s buy a knowledge base. A knowledge base is going to solve our problems.” That’s not true. It’s never been true. Maybe that’s the way we thought about it for a long time, but it’s never been true. The content in the knowledge base solves the problem. The knowledge base facilitates some of that content getting to the end user. But as we go to scale up and really get to the point where we’re at a full omni-channel world, where we can deliver content wherever our customer is, in any form of digital channel, digital experience to really meet them, meet their need, we have to be in a place where we’ve decoupled that concept.
Patrick:
And we recognize that, as we’re working on content, we’re working on content because content is the thing that solves the problem. It’s not any one individual system in the ecosystem. And I think that that’s a shift that we’re starting to see. It’s really helpful. I mean, our business thesis is based on it, so obviously I like it.
Patrick:
But even if it weren’t our business, even if we disappeared, it’s still the right change for the world, and it’s going to create a smarter, more efficient, better experience and better customer, better interaction, more knowledgeable world around us as we get through this. So it’s really exciting. It’s a really exciting time to be in content. I’m glad to be here. I’m glad that I get to contribute, even if it’s just in my one little space, whatever it may be, and I think it’s the most interesting thing going on in business today.
Larry:
And again, this always happens too. You’ve just introduced an idea for a whole other podcast episode about how to take, even if… because even in your little place as you describe, that the impact can be so huge if we connect with others, but I might have to have you back on at some point, but thanks again. There’s so much, Patrick. Hey, one very last thing. I want to make sure. How can folks get ahold of you? What’s the best way, social media or your website, or… What’s the best way for folks to follow you?
Patrick:
Oh, sure. So social media is best really. So I’m just @PatrickBosek on Twitter. That’s the easiest way to get a hold of me. The website has your standard contact form type stuff, which is easyDITA.com, which is our product, and you can obviously go there if you’re interested in what we do. There’s a lot of information and you can always talk to us. We’re quite friendly people. As it relates to… I’m also on LinkedIn, which I guess is slash in slash, my name. I think it’s just Patrick Bosek, and yeah. So I’m pretty easy to find. I also go to a lot of the conferences and I’m on the Write The Docs Slack community, so you can find me there too. If you want to direct message me, go and sign up to Write The Docs, and I love to chat, so message me.
Larry:
Excellent. Easy to get a hold of you. Perfect. Well, thanks so much, Patrick. I really enjoyed the conversation. Good stuff.
Patrick:
This was a lot of fun. Thanks, Larry.