The following is an extract from an email I sent in April 2023 - of interest because it feels like this situation hasn't changed dramatically in the intervening 2 years.
--
No doubt you've seen some of the media coverage around new AI models, especially OpenAI's GPT models, and have been thinking about how to use them with your teams.
I've been using them quite extensively, and the more I use them, the more impressed I am - i.e. I'm pretty sure this is not just a fad, but is a serious technological breakthrough that has the potential to revolutionise quite significant parts of Civil Service work.
I thought it might be useful to note a handful of areas where there seems to be low-hanging fruit, and where the models may be appropriate for serious applications:
- Zero shot labelling. This is the ability to take an input document and categorise it according to any criterion of your choice without training a new AI model (that's the 'zero shot' bit). For example, the model could take a sentencing transcript as input and categorise whether there was 'use of a weapon'. The important technical advance here is that these new models understand semantics, not just keywords, so the transcript could contain 'the victim was stabbed' without ever mentioning the word 'weapon', and the model would still recognise this as use of a weapon.
- Semantic search across a large corpus of potentially unstructured documents. There's an example here of using these models to analyse 1000 pages of Tesla's annual reports: Tweet.
Both zero shot labelling and semantic search have been 'somewhat' possible in the past, but they took a lot of work and didn't work that well. What's new is that they are now much easier and more accurate.
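To make the zero shot labelling idea concrete, here is a minimal Python sketch of how it might be wired up. It is an illustration under assumptions, not a description of any particular system: the function names (`build_prompt`, `zero_shot_label`), the prompt wording, and the `complete` callable are all hypothetical. `complete` stands in for whatever LLM client you have access to; it takes a prompt string and returns the model's text reply. The stub at the bottom exists only so the sketch runs without a model.

```python
from typing import Callable


def build_prompt(document: str, criterion: str) -> str:
    """Build a yes/no question about the document.

    No labelled examples are included in the prompt, which is what
    makes this 'zero shot'. The wording is an assumption.
    """
    return (
        "Read the document below and answer with a single word, YES or NO.\n"
        f"Question: does the document describe {criterion}?\n\n"
        f"Document:\n{document}\n\n"
        "Answer:"
    )


def zero_shot_label(
    document: str,
    criterion: str,
    complete: Callable[[str], str],
) -> bool:
    """Return True if the model answers YES, False if it answers NO.

    `complete` is a hypothetical stand-in for a real LLM call.
    """
    reply = complete(build_prompt(document, criterion)).strip().upper()
    if reply.startswith("YES"):
        return True
    if reply.startswith("NO"):
        return False
    # Models do not always follow instructions, so fail loudly
    # rather than silently guessing a label.
    raise ValueError(f"Unexpected model reply: {reply!r}")


def stub_complete(prompt: str) -> str:
    """Placeholder that always answers YES; replace with a real model call."""
    return "YES"


if __name__ == "__main__":
    transcript = "The victim was stabbed during the robbery."
    print(zero_shot_label(transcript, "use of a weapon", stub_complete))
```

Changing the category is just a matter of passing a different `criterion` string, which is the point: nothing is retrained. In practice you would want to check a sample of the labels by hand before relying on them.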
- Code completion. As a data scientist, I'm getting ChatGPT 4 to write probably 50%+ of my code. So at the very least, these models are a huge productivity amplifier for data scientists.
The biggest challenge is around data sharing and using these tools with sensitive government data. It feels like getting a head start on understanding these legal issues may be an important first step.