14 August 2009
Brad Scott
I’m not at all surprised by the news item in today’s Bookseller[1] that the move to digital technologies has revealed a major skills gap in the publishing industry. It feels as though that has been the case for a long time. Over the past few years, I’ve often commented on the limited digital expertise in some major publishing houses, which tend to outsource almost everything.
Outsourcing is no bad thing; it’s what publishers do, and it works well. But in the digital domain, if done to excess, it can feel as though the publisher is abdicating responsibility for its digital business and placing it in the hands of a third-party technology company, thereby letting the technology drive the business rather than the other way round.
I’m certainly a beneficiary of that outsourcing, and my current clutch of clients tend to be among those publishers who have a good range of skills for electronic publishing, but even they would admit that there are always holes and gaps in understanding and practice that get pushed to one side amid the daily routine of getting new products to market.
I can understand some of the gaps. After all, why would many publishers want to get too heavily involved with XML schemas? But at least some basic XML knowledge should be mandatory these days, if only for fixing typos.
So, I’d be interested to know where you see the gaps, whether in your own skill set, in your business, or in some of your competitors.
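As an illustration of the kind of basic XML knowledge I mean, here is a minimal sketch in Python of checking that a file is still well-formed after a typo has been fixed by hand. The function name `check_well_formed` and the sample dictionary entry are hypothetical, made up purely for this example, and the check only covers well-formedness, not validity against a schema.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


# Hypothetical entry after a typo in the text has been corrected by hand.
good = "<entry><headword>publisher</headword><def>One who publishes.</def></entry>"
# The same entry with a closing tag mistyped during the edit.
bad = "<entry><headword>publisher</headward><def>One who publishes.</def></entry>"

print(check_well_formed(good))
print(check_well_formed(bad))
```

The second call reports a parse error rather than silently accepting the broken markup, which is the sort of slip that is easy to make when editing XML in a plain text editor.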
- Neilan, Catherine. “Digital skills gap now ‘critical’ for publishers.” 14 August 2009. The Bookseller. http://www.thebookseller.com/news/94322-digital-skills-gap-now-critical-for-publishers.html.rss. The Skillset report was also the basis of a piece in the Guardian, though its headline seems to have somewhat misrepresented the substance of the findings: Holmwood, Leigh. “Literacy level of recruits now a major concern for media, report finds.” 13 August 2009. The Guardian. http://www.guardian.co.uk/media/2009/aug/13/literacy-concerns-media-recruits-skillset-report
Posted in Digital publishing
26 June 2009
Brad Scott
After working on a number of online dictionary and reference projects over the years, it’s always nice to see some neat innovations. OUP have just made some of their wonderful Oxford Paperback Reference titles available on the iPhone. Maybe it’s time I got that gadget.
I was also excited to read the ReadWriteWeb item about Wordnik.[1] It may not have the breadth and currency of other online dictionaries, but it has creatively pulled together a lovely range of supporting materials in a nice user experience. It shows how effectively you can utilise data that is available via APIs from other sources. Its dictionaries include the American Heritage Dictionary, Webster’s (1913) and a few others, but what makes it exciting is the other items: the examples from texts at Project Gutenberg; thesaurus items; Twitter usages; pictures from Flickr; a graphical view of the occurrence of the word over time; etymology; and pronunciation. Users can add their own notes, as well as pronunciation examples. No doubt more funky features will get added. Do other dictionary publishers need to raise their game?
- Lardinois, Frederic. “Enamored With Words? You’ll Love Wordnik.” 9 June 2009. ReadWriteWeb. http://www.readwriteweb.com/archives/enamored_words_youll_love_wordnik.php
Posted in Dictionaries, Digital publishing
23 June 2009
Brad Scott
Is the momentum building on the whole linked data and semantic web thing?
Finally catching up on some reading, I saw the piece in the Guardian about how Tim Berners-Lee is to help the UK government make its data more easily available online.[1] This can only be a good thing for raising awareness, not only of how to do it, but also that it can work. The Linked Data initiative certainly has some useful material on making it happen, and the spring report from PricewaterhouseCoopers also focuses on the semantic web and how some businesses such as the BBC are now beginning to engage with it.
Last week, at the Semantic Technology conference in San Jose, the keynote from Tom Tague of Thomson Reuters’ OpenCalais gave a useful introduction to the trends in this very interesting area.[2] More details about many of the papers and other talks should appear on the conference website soon.
Making a start with the semantic web should be getting easier, as the recent announcement about Google Rich Snippets made clear, though as Richard Padley noted in his blog, Google’s use of RDFa is not completely kosher.[3] In parallel with that development, Common Tag has also opened up an RDFa-based means of achieving decentralised interoperability between tags.[4]
How far have you got with your engagement with the semantic web? I’d be interested to know to what extent publishers are starting to put a toe in the water.
- Arthur, Charles. “Web inventor to help Downing Street open up government data.” 10 June 2009. The Guardian. http://www.guardian.co.uk/technology/2009/jun/10/berners-lee-downing-street-web-open
- MacManus, Richard. “The State of the Market in Semantic Technologies.” 16 June 2009. ReadWriteWeb. http://www.readwriteweb.com/archives/the_state_of_the_market_in_semantic_technologies.php
- Padley, Richard. “What does Google’s RDFa support mean for publishers?” 18 May 2009. The Discovery Blog. http://blogs.semantico.com/discovery-blog/2009/05/what-does-googles-rdfa-support-mean-for-publishers/
- O’Dell, Jolie. “Common Tag Brings Standards to Metadata.” 10 June 2009. ReadWriteWeb. http://www.readwriteweb.com/archives/common_tag_brings_standards_to_metadata.php
Posted in Digital publishing, Semantic web
13 June 2009
Brad Scott
I was in Oxford yesterday and couldn’t quite believe this one outside the George Inn in Botley. I had to go back to have another look, and then got snarled at by a cyclist since I was rather distracted by the migrating character. I’ve not seen this particular occurrence before, though maybe someone in the Atrocious Apostrophe’s group on Flickr has.
Posted in Apostrophes
11 June 2009
Brad Scott
I’ve been involved with the publication of products containing fairly large amounts of data for well over a decade now, and finding some old articles of mine made me think about what has changed for publishers who handle such content.
Certainly, the volume of data for individual projects has increased, which in turn has meant that publishers have got a bit better at managing and archiving their data assets, though I wish that were more generally true; valuable data can still be stored in the equivalent of a shoe box with inadequate documentation. Suppliers are generally better (and cheaper) too, not least since they now have more familiarity with the important data standards. Even so, data testing and QA can still be problematic, and that is equally true internally within publishers.
Compared with a decade ago, user requirements and expectations tend to inform data design more, and some publishers certainly have well-thought-out and documented data models that have been constructed with usage in mind. But the technology platform that delivers the content can sometimes be what shapes the data, rather than the user, and that can lead to some ugly and inflexible choices.
Nevertheless, when faced with a new data creation or migration project, there is still an unavoidably large amount of grind and planning required to get it right. That’s what I found so interesting re-reading these ten-year-old articles. Though the delivery technology has changed, the processes and thinking required aren’t very different, and I could have written similar things about many of the projects I’ve worked on since then.
The articles themselves date back to when I was digital publisher at Routledge in the late 1990s. One describes the creation of Asia: Official British Documents (1998),[1] which was published with the British National Archives and comprised 40,000 page images of original archive content plus metadata; the second focuses on the data of the Calendar of State Papers Colonial series (1999).[2]
The former was mostly an exercise in tracking bits of paper in a database, but the latter was an SGML implementation, drawing on the models of the Text Encoding Initiative (TEI) and the Model Editions Partnership. In the years since then I’ve been extending the TEI for several other projects, such as the New Palgrave Dictionary of Economics, and the MLA Handbook, which has meant adding in MathML and the CALS table model. Fundamentally though, the process for planning and creating the data for these products hasn’t changed much at all.
- Scott, Brad. “Creating an Image Edition of Historical Material: Asia: Official British Documents, 1945-1965.” 1998. http://www.brambletye-publishing.co.uk/consultancy/creating-an-image-edition-of-historical-material/
- Scott, Brad. “Retrospective Data Conversion in a Commercial Publishing Environment: The Calendar of State Papers, Colonial.” 1999. http://www.brambletye-publishing.co.uk/consultancy/retrospective-data-conversion-in-a-commercial-publishing-environment/
Posted in Data, Production