Episode 6
HockeyStick #6 - AI-powered Developer
Generative AI in Software Development: A Future Without Coders?
In this episode of HockeyStick, Miko Pawlikowski interviews Nathan B. Crocker, CTO at Checkr and author of 'AI-Powered Developer,' exploring the impact of generative AI tools like ChatGPT and Copilot on software development. They discuss the book's insights into using AI as a junior developer, its appeal to different levels of software practitioners, and experiences with generative AI for coding tasks. The conversation covers AI's role in designing, testing, refactoring, and understanding code, addressing job security concerns for software engineers. They also tackle the effectiveness of local LLMs versus online models, the evolving landscape of AI in coding, and future directions for developers using AI tools.
00:00 Welcome to HockeyStick: Exploring Generative AI for Code
00:23 Diving Into AI-Powered Development with Nathan B. Crocker
00:44 The Practical Guide to AI in Coding: Insights and Experiences
02:47 The Revolutionary Impact of AI on Software Development
04:46 ChatGPT: A New Era of Coding Assistance
08:57 The Magic of Copilot in Your IDE
10:40 Navigating the Challenges of Custom Code with AI Tools
14:46 Designing Software with AI: Beyond Just Code
17:45 Refactoring and Upgrading with AI: A New Frontier
20:27 The Quirks of AI: From Training Data to Practical Use
23:34 Exploring the Limits of AI in Software Testing
24:01 Exploring AI in Testing and Development
24:25 Harnessing AI for Software Testing
25:08 AI's Role in Code Depreciation and Asset Management
25:59 Understanding and Describing Code with AI
28:47 Security Insights and Ethical Considerations in AI
32:05 AI in Infrastructure and Deployment
37:36 Evaluating Local LLMs and Their Capabilities
42:18 The Future of Coding and AI: Predictions and Perspectives
44:01 Closing Thoughts and Next Steps for the Author
Transcript
I'm Miko Pawlikowski, and this is HockeyStick.
Speaker:Today, we're talking about generative AI for code.
Speaker:You know, whether software engineers should start worrying
Speaker:about their job security.
Speaker:We're talking about chatting to LLMs to help you design, build,
Speaker:test, and understand software, both online and offline, as
Speaker:well as tools like Copilot.
Speaker:I'm joined by Nathan B.
Speaker:Crocker, the author of AI-Powered Developer, published by Manning,
Speaker:and the CTO and co-founder at Checkr, a tokenization startup.
Speaker:He just finished his book and we're covering his experience with
Speaker:using AI as a junior developer.
Speaker:Welcome to this episode and thank you for flying HockeyStick.
Speaker:So I had the pleasure to read your book.
Speaker:I would say that it's a real practitioner's guide to
Speaker:using AI to work with code.
Speaker:It's fairly light on details, so nothing to scare people off.
Speaker:It basically jumps right into how to get value out of artificial
Speaker:intelligence if you're working with code.
Speaker:Would you like to tell us a little bit how you ended up writing this book?
Speaker:I actually had a number of co-workers and other developers who were telling me,
Speaker:"Hey, you got to check out this stuff".
Speaker:It's really something.
Speaker:So this was November of 2022.
Speaker:And I had no idea what it was.
Speaker:I started looking into it.
Speaker:It piqued my interest, but I needed a really deep motivation to actually
Speaker:dive into it and incorporate it.
Speaker:'Cause if you just use something periodically, lightly, you're
Speaker:not really engaged with it.
Speaker:So I pitched Manning the book, they liked the idea, and it
Speaker:really is my journey through learning how to use these tools.
Speaker:My journey is really going to mirror the practitioners that are
Speaker:reading it as they work through it.
Speaker:So who is it for?
Speaker:What's the requirement to get value out of this book?
Speaker:You should have some familiarity with Python.
Speaker:But if I just take a step back, it's really for anyone
Speaker:early journey or mid journey as a software developer or a software architect.
Speaker:I suppose as a business analyst, you could derive some value.
Speaker:All the examples are in Python, and there's a couple of microservice chapters, so you
Speaker:should have some familiarity with that, but it really is about taking you through
Speaker:things that you may or may not be familiar with and working with gen AI to really
Speaker:teach you some of these concepts as well.
Speaker:Yeah, maybe you graduated from a computer science program and you're fresh out
Speaker:of college and you want to know how to take your development to the next level.
Speaker:That would really be the target demo.
Speaker:I think the next level part of it is actually the keyword here.
Speaker:One of the first things that people see when they open your book, I
Speaker:think it's literally like page one, is the silent promotion.
Speaker:Everybody all of a sudden overnight became an engineering manager
Speaker:who basically can have a pretty good junior working for them for free, with
Speaker:no labor laws or anything like that.
Speaker:Why is that such a big deal?
Speaker:It's such a big deal because it used to be the rubber duck.
Speaker:Like you'd have a partner that you can work with, that you can tell to
Speaker:do things, that you can bounce ideas off of, that you can look to for some
Speaker:answers, because their thinking is going to be different than yours,
Speaker:so they might have a different tack and a different approach.
Speaker:you could farm out a lot of the work that you don't want to do.
Speaker:The repetitive boilerplate, CRUD operations.
Speaker:It's all there, but like any junior developer, a super smart junior developer
Speaker:that is, they can produce some bafflingly poor thinking and wind up with just some
Speaker:nonsense that isn't necessarily usable.
Speaker:so you gotta watch them.
Speaker:I suspect that if we were to partition the body of listeners to this, or maybe even
Speaker:developers in general, there'll be almost none in the category of 'I haven't heard
Speaker:of it' or 'I haven't even played with that'.
Speaker:Other than people who might be on a very long vacation, "Cast Away" style.
Speaker:I don't see how you can really escape that.
Speaker:Then you probably have a category of people:
Speaker:'Okay, I played with it.
Speaker:I went to talk to ChatGPT, it spat out some code.
Speaker:I saw roughly what it can do, but I never really got much value out of that'.
Speaker:And then the category of people who actually go and use it day-to-day.
Speaker:Because it is helping their job.
Speaker:There might be some caveats to that.
Speaker:Obviously, data privacy and not knowing where the code actually goes and not
Speaker:knowing whether it's going to be trained on and that kind of stuff that can,
Speaker:throw a wrench in some people's work.
Speaker:So should we start with the category of people who might have played with
Speaker:it a little bit, and, they went to ChatGPT, asked the same questions,
Speaker:"how tall is this building?"
Speaker:And, "can you search that for me?"
Speaker:And they stalled there. From a basic development point of
Speaker:view, what kind of value can we extract just chatting to ChatGPT?
Speaker:What can it do?
Speaker:I had a good, an interesting, experience, very early on when I was doing research
Speaker:for the book, I had it on my phone, I would carry it around and just
Speaker:periodically I would hand my phone to someone just to give them a taste.
Speaker:I remember I was at a party, and a woman, she was an expert, it was
Speaker:archeology or maybe it was art history.
Speaker:And she really started asking it some questions.
Speaker:Some of the answers were factually incorrect, but over the course of her conversation
Speaker:with ChatGPT, she became really impressed with the accuracy and justification for
Speaker:some of the answers it was providing.
Speaker:She would say like, 'why did you refer to this as the oldest, example of this
Speaker:architecture or painting style?' And it gave her a fairly convincing reason.
Speaker:I think there's value in exploring this technology, even if you're not
Speaker:going to use it in your everyday development effort, just because it
Speaker:gives you a sense of the future.
Speaker:Things are going to dramatically change; there are implications that are
Speaker:rippling all throughout academia now.
Speaker:I feel that you should just keep yourself informed about what's coming, where
Speaker:these changes are going to be made, and how it could potentially affect you.
Speaker:So from the point of actually going and using it the way that you described
Speaker:in your book, what's the experience like at this very lowest level of just
Speaker:launching ChatGPT and asking some questions?
Speaker:Because I remember doing that a while back; it would spit out some code and I had to
Speaker:copy-paste it, and I had to add imports.
Speaker:How good is it right now?
Speaker:You're fresh off writing chapters about that.
Speaker:How useful is it?
Speaker:It largely depends on whether you're using 3, 3.5, or 4.
Speaker:4 is much better, exponentially better than 3.5, but it's solid, I would say.
Speaker:There's a lot of caveats.
Speaker:You're most likely going to be looking at code that is one to two years old,
Speaker:so it was trained on data that was potentially from an old version of a library.
Speaker:They may have had breaking changes,
Speaker:especially if it's a fast-moving language. Like, I was trying
Speaker:to build something in Rust, and I just asked ChatGPT to generate
Speaker:some code and it wouldn't even compile with the newest compiler.
Speaker:It's good, I would say, but it's not perfect.
Speaker:It's a long way from perfect.
Speaker:What are some remarkable limitations that you bumped into?
Speaker:You must have seen some interesting, funny stuff.
Speaker:Can you share some of that?
Speaker:I haven't seen really outrageous things where it was just making up
Speaker:libraries or frameworks or anything.
Speaker:I would say it's unremarkable in its banality.
Speaker:The Rust example was probably the best one, but it wasn't great.
Speaker:I wish I had something funny. Just to jump ahead a little bit,
Speaker:I had a really hard time in the testing chapter trying to get it to
Speaker:write good tests, and maybe, you know, I don't even remember if I mentioned it
Speaker:in the book, but at one point I just gave up and wrote the test myself,
Speaker:because it was very hard to get it to understand what the unit under test was,
Speaker:what I was actually trying to accomplish with my test.
Speaker:No matter how much context I added, it was always trying to do
Speaker:something just completely different.
Speaker:When I was reading that chapter, I was also thinking in the
Speaker:back of my head, "What does it say about the training data
Speaker:that the tests are so poor?"
Speaker:yeah.
Speaker:Yeah.
Speaker:Are all those tests just so poorly written that that's where I end up?
Speaker:But yeah, let's touch on that in a sec
Speaker:who needs tests?
Speaker:We'll test it in production.
Speaker:It'll be fine.
Speaker:There you go, testing in production, everybody.
Speaker:yeah, don't worry.
Speaker:We can cut that out.
Speaker:that was a joke.
Speaker:The most annoying bit was just that you have to chat.
Speaker:So then obviously you've got things like Copilot, that you also cover in your book,
Speaker:or just plugins in your VS Code or whatever.
Speaker:What's the added value of that?
Speaker:Is it just that it can work as autocompletion and it's more syntactic
Speaker:and you don't have to copy the code?
Speaker:Do you get any other bonuses out of that?
Speaker:The real value of a tool like Copilot, again, versus ChatGPT, is it
Speaker:does keep you in the IDE and it can keep you in that flow state, where
Speaker:it's only you and the code, right?
Speaker:You're not having to pull yourself out of the context and
Speaker:move to a different window.
Speaker:And for certain projects, the actual code quality for Copilot was better,
Speaker:just on a line-by-line basis or class-by-class basis.
Speaker:That's almost certainly due to the fact that it was fine-tuned
Speaker:specifically for code.
Speaker:That's the main benefit I've found. It's always adding helpful suggestions,
Speaker:sometimes not-so-helpful suggestions too.
Speaker:Like I don't need it to add a comment about the name of the file that I'm
Speaker:working on, but, if I can start to define a method and then it gives me a
Speaker:possible implementation, even if I don't accept it, it's at least, showing me one
Speaker:possible implementation that I could use.
Speaker:maybe it's not the exact one, the one I wanted, but having that suggestion
Speaker:can be very valuable to clarify my thinking or to even, change it.
Speaker:Maybe it's a better implementation than I was thinking of.
Speaker:So those are the major advantages I found.
Speaker:It always works very well in demos where you've got the usual suspect,
Speaker:an HTTP server in a popular framework in a popular language, and
Speaker:you do something that has been done to death a million times on GitHub.
Speaker:How well does it work with a custom code base?
Speaker:Oftentimes you find yourself in a situation where your company has a
Speaker:decent or a large amount of code libraries, stuff that obviously
Speaker:wasn't trained on because it's not in the public domain, it's not on GitHub.
Speaker:How well does it work in these kinds of situations?
Speaker:You'll face some challenges there if you're working on a very niche
Speaker:problem. For example, you're trying to write an API gateway or something.
Speaker:I suppose there are probably good open source examples out there, but if you're
Speaker:working in a fairly niche industry, everything is going to be closed source.
Speaker:You'll probably struggle with its suggestions.
Speaker:I don't think they're going to be particularly helpful,
Speaker:although it's going to try, and in that trying, maybe it does inspire you.
Speaker:Maybe it does give you one possible implementation.
Speaker:It's really good at just generating something,
Speaker:and if nothing else, it can help you plan your approach.
Speaker:And you could ask it questions inline and have it answer. One of the more
Speaker:interesting things that I found as I was working with Copilot specifically,
Speaker:one of the almost magical things, is you type in a question in a comment,
Speaker:and then suddenly you prompt it for an answer and it'll give you one.
Speaker:You're like, that's fairly interesting.
Speaker:I wouldn't have thought to take that tack, but then I used it over and over.
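To make the comment-as-prompt pattern concrete, here is a small sketch of the kind of exchange you might have in the editor. The question goes in a comment, and the completion below is the kind of implementation Copilot tends to offer; the function name is my own invention, not an example from the book.

```python
# Q: how do I deduplicate a list while preserving order?
# A (the kind of completion Copilot tends to offer):
def dedupe_preserving_order(items):
    """Drop duplicates but keep the first occurrence of each item, in order."""
    # dict keys preserve insertion order in Python 3.7+, so this is a one-liner.
    return list(dict.fromkeys(items))

print(dedupe_preserving_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

Even when the suggested body isn't what you wanted, the comment-question keeps you in the editor instead of switching to a chat window.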
Speaker:yeah.
Speaker:There are these moments of magic, and a famous quote about
Speaker:sufficiently advanced technology being indistinguishable from magic.
Speaker:Yeah.
Speaker:People get that a lot.
Speaker:And I agree with that.
Speaker:but how does it actually work?
Speaker:So let's say I've got my VS code open and I've got some code and it's got
Speaker:some imports and existing code base.
Speaker:does it upload the whole thing to OpenAI, to be able to generate the useful things?
Speaker:What's the context that OpenAI ends up having somewhere in the training data?
Speaker:Yeah, it is going to encode the context, which will result in a good
Speaker:portion of that code being uploaded.
Speaker:OpenAI promises, pinky swear, that it's not being saved anywhere.
Speaker:I don't think we have any way to assess whether that's true or not,
Speaker:not to delve into conspiratorial thinking, but you should be careful,
Speaker:certainly if you're working on proprietary software. But that's why there are
Speaker:other alternatives that are entirely offline, that you can delve into if
Speaker:you're really very privacy-concerned,
Speaker:'cause yeah, frankly, we don't know how it's being used
Speaker:once it leaves our machine.
Speaker:We'll definitely touch base on Llama and other alternatives that
Speaker:you discuss in your book.
Speaker:But is there a way to control it, or at least tell it, 'okay, only upload this
Speaker:folder', or is it just fully automatic?
Speaker:It just decides by itself what it sends?
Speaker:you can tell it, but is it going to honor that?
Speaker:I don't really have a good answer.
Speaker:Okay.
Speaker:So something to check
Speaker:yeah.
Speaker:Something to check.
Speaker:For anybody who's now browsing manning.com, there is a live version
Speaker:where you can see elements of the book.
Speaker:Figure 2.18 is a nice summary.
Speaker:There's a bunch of figures; there's a circle for unsupported, a triangle for
Speaker:supported, and a square for exclusively. It's comparing ChatGPT, just being used
Speaker:by itself, to Copilot and CodeWhisperer,
Speaker:and it's summarizing whether they can generate methods, classes,
Speaker:projects, generate documentation, switch languages, and stuff like that.
Speaker:So for anybody who wants to delve a little bit more into the
Speaker:details, I think that's very handy.
Speaker:For anybody who might be beyond that, so they went to ChatGPT, they
Speaker:spoke to it, and they got some code, and it was a generally pleasant
Speaker:experience, and they want more.
Speaker:They use Copilot.
Speaker:What's the next kind of checkpoint?
Speaker:Where do they go from there?
Speaker:How do they start designing software at a bit higher level
Speaker:than just snippets of code?
Speaker:How useful is the AI here?
Speaker:For example, I was designing something yesterday, and I was working through
Speaker:a conversation with ChatGPT. That is one of the key things that ChatGPT
Speaker:excels at, helping you to design the software, to really underscore that.
Speaker:Not just lines of code, though it's perfectly capable of that,
Speaker:and not even just the classes, but like, here are the patterns that you want to apply.
Speaker:Here's the architecture.
Speaker:Have it generate some of the documents in a text format,
Speaker:so PlantUML or Mermaid. Those are really good, useful things, because then you can always
Speaker:take those, save those, and pass them back to ChatGPT to refresh the context.
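To illustrate, a design conversation might end with ChatGPT emitting a text-based diagram like the Mermaid sketch below, which you can save and paste back into a later chat to refresh the context. The class names here are invented for this example, not taken from the book.

```mermaid
classDiagram
    class Asset {
        +name: str
        +cost: float
        +first_year_depreciation() float
    }
    class DepreciationStrategy {
        <<interface>>
        +annual_amount(cost, years) float
    }
    Asset --> DepreciationStrategy : uses
```

Because it's plain text, the diagram also diffs cleanly in version control, unlike a binary image export.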
Speaker:So yeah, as a co-founder and the CTO of a startup, I found it
Speaker:really invaluable as a partner to help me design that software.
Speaker:I think one of the things that really opened my eyes was that I never thought
Speaker:to talk to ChatGPT about open source alternatives, and maybe trying to
Speaker:select a database and talking about the different properties, like it
Speaker:was just second nature for me to open the different docs and just start
Speaker:comparing features and stuff like that.
Speaker:And it never occurred to me that I can just go and ask ChatGPT because it's
Speaker:got quite a lot of knowledge about that.
Speaker:Yeah.
Speaker:I think in the book you're talking about open source alternatives
Speaker:to what you're writing, which is
Speaker:an IT asset management system. Actually, I don't know if this part's going to work,
Speaker:so just be aware, or just be advised.
Speaker:I got a lot of feedback that it was really boring, right?
Speaker:That people didn't like the actual project that you have
Speaker:to work on throughout the book.
Speaker:But I wanted it to be a boring book on a boring topic, a boring
Speaker:application, because most of what we write is not interesting.
Speaker:We pick up data and we shuffle it and we move it around, right?
Speaker:A lot of what we do is not exciting.
Speaker:it was definitely intentional.
Speaker:But, again, maybe something to fix in a second edition, if it's coming.
Speaker:But one of the more interesting things about my engagement model with these
Speaker:tools as I worked with them, to pick up on what you were saying about learning
Speaker:more about a database or having it help select a database or selecting
Speaker:open source projects, is that very early on,
Speaker:I was being extremely prescriptive.
Speaker:I would say, create software that's using this library, in this framework,
Speaker:in this language, and all of that.
Speaker:But later on, and even to this day, when I have a problem, I
Speaker:feed in the business requirements,
Speaker:and then I ask it to make recommendations for me,
Speaker:and then I can assess those.
Speaker:But at the very least it starts the process.
Speaker:It gets it going.
Speaker:so hopefully, that answered your question or was at least in the neighborhood.
Speaker:Yeah, definitely in the neighborhood, same district.
Speaker:Same zip code.
Speaker:Yeah.
Speaker:The way my mind works is that when I hear the idea of a free junior available 24/7,
Speaker:my mind wanders to things like the already mentioned docs. We hinted at tests coming
Speaker:a bit later, but I think one of the things that is painful in more than one
Speaker:way, and people never want to do it, is refactoring and upgrading to a new version
Speaker:of something, or maybe changing language, which is surprisingly labor-intensive.
Speaker:It always ends up being more work than it looked initially.
Speaker:How good is the AI at the moment at these kinds of things?
Speaker:Refactoring, rewriting in a different language, upgrading a library.
Speaker:Can you just say:
Speaker:'Hey, this is a library with a breaking change.
Speaker:Give me the new version of the library and updated tests and everything'?
Speaker:If the post-breaking-change version was in the training
Speaker:data, you should be fine.
Speaker:If not, you're going to have a more involved conversation.
Speaker:but more generally, it does really well in translating from one language to another,
Speaker:specifically programming languages.
Speaker:I couldn't
Speaker:assess its quality at English to French or something like that.
Speaker:But I can tell you, there was a few examples where I was working in Python
Speaker:and then I said, 'Oh, what would this look like in Go?' And it gave me
Speaker:just a literal translation into Go.
Speaker:And I was like, this doesn't feel very idiomatic, make it idiomatic.
Speaker:And it would be as good as or better than I would have written it myself.
Speaker:so it does surprisingly well in going from one language to another.
Speaker:And then on to refactoring: you can ask it for certain patterns that
Speaker:you may want to apply as you refactor, different design schemes. Like, maybe
Speaker:I need to pull out an interface.
Speaker:Maybe I need some kind of parent class.
Speaker:Maybe this needs to be an adapter, or you take your pick from the Gang
Speaker:of Four, and it knows them and it can provide examples in any language
Speaker:you can think of that it was trained on.
Speaker:So it can take a lot of that drudgery away, and
Speaker:a lot of that anxiety away.
Speaker:One of the most important benefits that we can derive from at least their
Speaker:current implementation of these tools and of genAI is to just keep us going,
Speaker:to keep us motivated, to keep us engaged, to keep us building software.
Speaker:It can be really mentally taxing, and this can help ease some of that
Speaker:intellectual heavy lifting, not that we should just suborn our thinking
Speaker:to it entirely, but it can help.
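As a sketch of the kind of refactoring being described, pulling out an interface and adapting a legacy class to it, here is a minimal adapter-pattern example. The class names are hypothetical, not taken from the book.

```python
from abc import ABC, abstractmethod

# The target interface we want callers to depend on.
class Notifier(ABC):
    @abstractmethod
    def send(self, message: str) -> str: ...

# Legacy class with an incompatible method signature we can't change.
class LegacyEmailClient:
    def deliver(self, subject, body):
        return f"email: {subject} / {body}"

# Adapter: wraps the legacy client behind the new interface.
class EmailNotifierAdapter(Notifier):
    def __init__(self, client: LegacyEmailClient):
        self._client = client

    def send(self, message: str) -> str:
        # Translate the new call into the legacy one.
        return self._client.deliver("notification", message)

notifier: Notifier = EmailNotifierAdapter(LegacyEmailClient())
print(notifier.send("build passed"))  # email: notification / build passed
```

Asking the model for "an adapter around this class" with the legacy code pasted in tends to produce exactly this shape, which you can then adjust.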
Speaker:Did you notice any discrepancies between quality, in different languages?
Speaker:Because what I'm picturing is that the body of training data
Speaker:came from somewhere like GitHub,
Speaker:probably.
Speaker:And if you look at GitHub, there's going to be a disproportionate
Speaker:amount of JavaScript, of questionable quality too, but you're going to
Speaker:have a probably increasing and quite significant amount of Go as well.
Speaker:But you might not have too much,
Speaker:I don't
Speaker:Haskell Yeah, Haskell,
Speaker:I don't know, SQL, whatever it is.
Speaker:did you notice anything funny about that?
Speaker:I would say that is
Speaker:roughly in line with what I observed, and I wouldn't necessarily have
Speaker:dug too deep into very niche languages.
Speaker:But definitely the examples that you're going to find, if you're working in
Speaker:Python, Go, JavaScript, or TypeScript,
Speaker:those are going to be more voluminous and likely higher quality.
Speaker:the one time I tried to use it to write, Rust, it failed spectacularly.
Speaker:It was beautiful.
Speaker:It was glorious.
Speaker:I was trying to throw together an API gateway,
Speaker:just to see how difficult this was going to be,
Speaker:and in Rust, because I wanted something high-performance.
Speaker:And I just asked it to start writing some code, and it created a number of
Speaker:files, and none of it worked well together and it wouldn't compile.
Speaker:Although it's Rust, so it would take a while to convince the
Speaker:compiler that it's good enough.
Speaker:but yeah, it was, not the most pleasant experience.
Speaker:But also, to be fair, at the time, I only spent a few hours learning
Speaker:the basic syntax of Rust, so I don't know really what I was expecting.
Speaker:So was it ChatGPT, or was it me, or was it a mixture of both?
Speaker:probably the latter.
Speaker:Yeah, I think we all occasionally bump into those weird restrictions
Speaker:based on the training data.
Speaker:One that I keep remembering was when I wanted Midjourney to generate
Speaker:for me a picture of a Triceratops.
Speaker:And it would give me any other dinosaur when I was asking
Speaker:for it, but not this one.
Speaker:It was all T-Rex and T-Rex.
Speaker:Then I started throwing random names, give me a Brontosaurus, and
Speaker:it just gave me a Brontosaurus.
Speaker:So I was very upset at the time, but I made peace with that.
Speaker:And there are some things that just weren't in the training set and
Speaker:they didn't emerge from training.
Speaker:come on, a triceratops?
Speaker:They're like the best.
Speaker:Yeah.
Speaker:You would think so, right?
Speaker:Very weird.
Speaker:Yeah.
Speaker:And if anyone from Midjourney is listening, this still needs fixing.
Speaker:This is months later and you still can't get a decent Triceratops.
Speaker:I was using DALL-E and I asked it for a pug-a-peg-a-corn.
Speaker:So that's a pug, a unicorn, and a Pegasus.
Speaker:And I got a pretty good one, pretty good representation.
Speaker:And then I said, make it cute.
Speaker:And it was the most adorable thing I've ever seen.
Speaker:Wow.
Speaker:But I
Speaker:did not try a triceratops.
Speaker:I know what I'm going to do after this.
Speaker:Yes, I encourage everyone to go create their own Pug-a-peg-a-corn
Speaker:exactly.
Speaker:Let's move to testing software.
Speaker:So you already said a little bit about how difficult it actually is.
Speaker:Can you give a more concrete example?
Speaker:What is wrong with the tests it's generating some of the time?
Speaker:In this case it was really struggling to figure out what I was actually
Speaker:trying to do with the test.
Speaker:Specifically, it was an integration test,
Speaker:and so I was trying to go mostly end to end in terms of serving data over REST.
Speaker:It was largely missing the point of the actual test,
Speaker:which was very strange.
Speaker:It was in Python, so there should've been a number of
Speaker:instances in the training data
Speaker:to cover this.
Speaker:Did you just say, 'I want an integration test, test everything', or did you
Speaker:describe more, 'end to end, I would like the data to flow through the whole thing'?
Speaker:Yeah, I felt it was fairly comprehensive.
Speaker:I think at that point, specifically, I was having Copilot write the test,
Speaker:and I believe I even went to ChatGPT and
Speaker:asked it, 'how would I write a prompt to get Copilot to do an integration
Speaker:test, an end-to-end test, for FastAPI,
Speaker:and the payload would look like this'.
Speaker:I eventually started having ChatGPT write my prompts for me,
Speaker:which it did surprisingly well.
Speaker:And it's meta.
Speaker:that's very meta.
Speaker:Okay.
Speaker:Was there a really good use case in terms of testing?
Speaker:Unit tests it was perfectly fine at, even in some cases where I felt
Speaker:it should have gotten stuck.
Speaker:So I had a number of, again, not to get too specific about the actual ITAM,
Speaker:the IT asset management project that is all throughout the corpus of the book:
Speaker:in accounting, assets depreciate at a certain rate, and generally
Speaker:accepted accounting principles outline a few different ways that you can do it.
Speaker:And so I used a strategy pattern, and I had a number of different
Speaker:ways, each of them, to calculate that depreciation.
Speaker:So the depreciation of the asset, maybe it's straight line,
Speaker:so it's over five years.
Speaker:So one fifth of the value is lost every year and you can write part of that off.
Speaker:But again, I'm not an accountant, so this does not count as financial advice, but,
Speaker:or
Speaker:or medical, yeah, I'm not a doctor, but it works surprisingly
Speaker:well, I was pleasantly surprised.
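The depreciation example lends itself to a small strategy-pattern sketch. This is my own minimal reconstruction, not the book's actual ITAM code, and the accounting is simplified: the double-declining figure shown is only the first-year amount.

```python
from abc import ABC, abstractmethod

# Each depreciation method is a strategy behind a common interface.
class DepreciationStrategy(ABC):
    @abstractmethod
    def annual_amount(self, cost: float, useful_life_years: int) -> float: ...

class StraightLine(DepreciationStrategy):
    """Equal write-off every year: cost spread evenly over the useful life."""
    def annual_amount(self, cost, useful_life_years):
        return cost / useful_life_years

class DoubleDeclining(DepreciationStrategy):
    """Accelerated write-off: twice the straight-line rate (first year only here)."""
    def annual_amount(self, cost, useful_life_years):
        return cost * 2 / useful_life_years

class Asset:
    def __init__(self, cost, useful_life_years, strategy: DepreciationStrategy):
        self.cost = cost
        self.useful_life_years = useful_life_years
        self.strategy = strategy

    def first_year_depreciation(self):
        # Delegate the calculation to whichever strategy was injected.
        return self.strategy.annual_amount(self.cost, self.useful_life_years)

laptop = Asset(cost=1500.0, useful_life_years=5, strategy=StraightLine())
print(laptop.first_year_depreciation())  # 300.0
```

Swapping `StraightLine()` for `DoubleDeclining()` changes the calculation without touching `Asset`, which is the point of the pattern.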
Speaker:Fair enough.
Speaker:So we've written some code, we've designed some software.
Speaker:Let's say that we tested it, for the most part.
Speaker:But the reality of it is that we're probably going to spend more time
Speaker:reading code and understanding code,
Speaker:perhaps the code that we wrote a couple of years back.
Speaker:Yeah.
Speaker:how well does the part of describing existing code actually work at the moment?
Speaker:Yeah, it works surprisingly well at translating the code that you wrote
Speaker:into very simplified answers, descriptions of here's
Speaker:how it functions,
Speaker:here's how it's working,
Speaker:here's what it expects.
Speaker:You can even have it describe an entire system to you.
Speaker:I did not, though, attempt what is probably one of the
Speaker:hardest things within that space.
Speaker:That is, I didn't feed it a Perl program and ask it what it actually did.
Speaker:I have this feeling it probably would have broken ChatGPT.
Speaker:Sorry.
Speaker:Taking pot shots at Perl.
Speaker:you should have given it a regular expression in
Speaker:Oof.
Speaker:and try to see what happens.
Speaker:And then next thing you know, OpenAI's knocking at your door, kicking you out.
Speaker:That sounds about right.
Speaker:Or a T-1000 is just kicking in the door.
Speaker:I guess in my mind there's this limitation on the amount of context
Speaker:length that you can feed it, right?
Speaker:So if your code base becomes significantly large,
Speaker:is that not going to be a problem, to get it to even describe it?
Speaker:To describe your entire codebase, yes.
Speaker:But you can start to chunk it up.
Speaker:You can work around that limitation by sending it only pieces.
Speaker:You're probably not going to get the full context there, but
Speaker:it can help guide your intuition.
Speaker:That's why, if you have some kind of class diagram or some architectural
Speaker:diagram that's text-based,
Speaker:so like PlantUML, then you can distill your entire code
Speaker:base into a single document.
Speaker:Now, again, if it's a code base of thousands of classes,
Speaker:you could still hit those limitations, but it's going to be your best bet
Speaker:to get a distillation in natural language of what your classes or what
Speaker:your code is attempting to do.
Speaker:It really does excel at method-by-method descriptions of what this does.
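The chunking workaround can be sketched in a few lines. This is a naive illustration, with a character budget standing in for a real token count and only top-level `def`/`class` lines used as split points, not a production tool.

```python
def chunk_source(text: str, max_chars: int = 6000) -> list[str]:
    """Split source code into chunks that fit a model's context window,
    preferring to break at top-level def/class boundaries."""
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        # Once the budget is exceeded, start a new chunk at the next
        # top-level definition so functions aren't cut in half.
        if size + len(line) > max_chars and line.startswith(("def ", "class ")):
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Each chunk can then be sent in its own prompt ("describe what this code does"), and the per-chunk summaries stitched into a codebase overview.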
Speaker:Between manual browsing through the code, trying to understand the intent, and
Speaker:Jarvis, it's halfway there, right?
Speaker:It's not quite, here's the intent and here's what it imported and
Speaker:here's my recommendations, Mr.
Speaker:Stark.
Speaker:It's more, I can ask about this method.
Speaker:Can't be bothered to read it.
Speaker:It's 2000 lines, and it can give me the gist of it.
Speaker:Exactly.
Speaker:Exactly.
Speaker:there's also the security aspect that, you're discussing in one of the chapters.
Speaker:Can you talk about that a little bit?
Speaker:Yeah, and actually there's, a funny, story that's embedded in that too.
Speaker:it's good at, picking up on what we were just talking about it, the non
Speaker:exclusive path, it can explain ways that your code might be exploited.
Speaker:it's not the same as having a, security expert on your team.
Speaker:it will miss things.
Speaker:but it's definitely better than nothing.
Speaker:and it can make some pretty great, recommendations in terms of,
Speaker:how you can structure your code.
Speaker:one of the funny things, I really wanted an example of,
Speaker:a SQL injection in the book.
Speaker:So I actually asked, ChatGPT to give me an example of a SQL injection.
Speaker:but it wouldn't.
Speaker:No matter how I tried to coerce it, no matter how I, No, I swear
Speaker:I'm not doing this for evil.
Speaker:This is just for illustrative purposes only.
Speaker:And, it just would not give me a valid, SQL injection exploit
Speaker:that I could include in the book.
Speaker:so do with that as you will.
Speaker:Yeah.
Speaker:that's a very interesting ethical, discussion about that.
Speaker:There's probably gonna be some way you can, I don't know if you heard about
Speaker:that exploit where, if you asked it to do something nefarious, it would say no, but
Speaker:if you asked it to do something and that something equals ASCII art of something
Speaker:nefarious, there was no problem at all.
Speaker:So I suspect, like it's very hard to Limit a model like that because there's endless
Speaker:opportunities to express it differently and you only need one of them to work.
Speaker:So an interesting one.
Speaker:So it will let you do an SQL injection even when your pinkies
Speaker:were, it was for the good,
Speaker:And I tried to do that, it was an exploit a little bit earlier on
Speaker:where you could give it a person, a persona of Dan and Dan is allowed
Speaker:to do things that ChatGPT isn't.
Speaker:And it's still, it wouldn't let me do it as, I think it
Speaker:was Dan and it was an acronym.
Speaker:but yeah, similar thing, but maybe I should have tried out ASCII art next time.
Speaker:but listeners do not intentionally put, SQL injection exploits in your code.
Speaker:oh yeah, that needed to be said.
Speaker:What are some of the examples of what he was able to, figure
Speaker:out from your code, in terms of security holes and stuff like that?
Speaker:Do you have any interesting examples of success?
Speaker:What did it actually find?
Speaker:I would have to go back and consult, the book.
Speaker:my code is just so good that there was no exploits to be made.
Speaker:No, that's not
Speaker:And there you go.
Speaker:So Nathan doesn't want to share too much
Speaker:about the book.
Speaker:You're going to have to go and buy it.
Speaker:of the, yeah, one of the things I wanted, I should have mentioned up front is,
Speaker:this was the first time, that I had ever built like a true application in Python.
Speaker:I had used it for scripts previously, just, to do something, but, text
Speaker:modification, things like that.
Speaker:But I never built an actual application.
Speaker:it actually helped me learn how to build applications while using it..
Speaker:There's a book called.
Speaker:Octopus, my teacher, I guess for you, it's more like ChatGPT, my teacher.
Speaker:All right.
Speaker:that's good.
Speaker:we've covered, I think most of the big chunks other than
Speaker:actually running the software.
Speaker:let's say that, it runs, we package that.
Speaker:And then we've got things like Docker, Terraform, the YAML hell that comes
Speaker:with Kubernetes on one hand, I would expect that this is fairly repetitive.
Speaker:so ChatGPT would excel.
Speaker:It's not a very tricky language.
Speaker:It's just very verbose, and the white spaces make your life miserable.
Speaker:How good is it with that kind of stuff?
Speaker:it was actually really good with working out YAML and making, just different
Speaker:scripts, helping build out, dev pipelines through GitHub actions, things like that.
Speaker:it did really well.
Speaker:one of the very interesting things that I discovered though,
Speaker:was CodeWhisperer, the AWS.
Speaker:generative AI, large language model actually doesn't support
Speaker:anything but programming languages.
Speaker:So it didn't even understand, how to do like Terraform
Speaker:infrastructure as code, which you'd think it would be very good at.
Speaker:that was a bit surprising.
Speaker:but it's by design, it's intentional.
Speaker:it's hard to see it as a limitation
Speaker:Curious.
Speaker:So do they have another tool for the YAMLs of the world?
Speaker:Or they just out-of-scope'd it.
Speaker:Yeah, just outta scoped, I didn't try the, what is it, cloud, not
Speaker:CloudFront, but whatever their, their deployment based, their, code as, or
Speaker:infrastructure as code, specific thing.
Speaker:I didn't try that.
Speaker:maybe I should have.
Speaker:but, I was.
Speaker:Shocked.
Speaker:I think I even mentioned that in the book, that I had originally
Speaker:intended that chapter to be written in using CodeWhisperer.
Speaker:let's say, for example, you want a quick Docker file, you can write it.
Speaker:It's not too hard, but why do it if you can't get it for free?
Speaker:So what you open a Docker file and you write in a comment what you want it to
Speaker:do and you hit tab and the magic happens.
Speaker:That's roughly what you need to do.
Speaker:Yeah, roughly.
Speaker:add a prompt as it were in a comment.
Speaker:it doesn't have to be a comment.
Speaker:you can just add the prompt and then later delete it.
Speaker:have it generate the Docker file for you or the, the Kubernetes file.
Speaker:I don't know if I tried using, patterns in a Terraform, just
Speaker:to ease some of the repetition.
Speaker:not necessarily have, the sprawling mess that Terraform can become.
Speaker:but, I'm sure it could accommodate that as well.
Speaker:it's both, Copilot and ChatGPT did seem to have extensive knowledge
Speaker:of Terraform syntax and features.
Speaker:So that all together adds up to a pretty competent, junior developer, like you
Speaker:described it at the beginning that you need to supervise, but it can do a lot of
Speaker:the legwork for you and much faster too.
Speaker:did you follow Devin, the supposedly first AI-driven coworker,
Speaker:no, that's interesting.
Speaker:Tell me more.
Speaker:it was a few weeks ago, they made this big announcement.
Speaker:There was a video with a demo showing basically doing the
Speaker:whole thing from scratch.
Speaker:So not only did it do the Copilot stuff, but it bootstrapped the whole project,
Speaker:generated all the files and had, like a browser window that had access to as well.
Speaker:to actually go and, verify that it works.
Speaker:Obviously, it was doing something that it always does in these demos, which
Speaker:is, an HTTP server with a REST API.
Speaker:Which is cheating if you ask me.
Speaker:a lot of people were very impressed and there was a lot of, angst among people
Speaker:on the internet arguing over whether this is the end of software engineering
Speaker:as we know it, or whether it's a scam.
Speaker:And then, a few days ago, there was a critique that resurfaced about Devin and
Speaker:that entire project and whether he was, A little bit polished up in the demo,
Speaker:In other words, just a typical software demo.
Speaker:Yeah, that's just typical software demo.
Speaker:So I think we have a similar problem to maybe different stakes, but to
Speaker:self driving cars that it can't be like 95% good without supervision has
Speaker:to be like, I don't know, 99% good.
Speaker:before we can live it with our supervision and sure, a badly written API.
Speaker:Probably most of the time it's not gonna get anybody killed, fingers crossed, but,
Speaker:you still need that supervision, right?
Speaker:and Devin was supposed to do away with that.
Speaker:So I'm looking forward to seeing how that story develops and
Speaker:how they answer the critique.
Speaker:And I guess when people can actually go and play it, we'll
Speaker:find out whether, was all fluff.
Speaker:it's another
Speaker:Yeah, no, that's interesting.
Speaker:I have been following, there was a story not long ago, that, because of
Speaker:the proliferation of, ChatGPT and, and Copilot and alike that the software
Speaker:has been getting less and less secure.
Speaker:and, Because, it is easy for, bugs to introduce themselves if you're
Speaker:really just copying and pasting.
Speaker:So that's why, we're not at that stage yet.
Speaker:We may never be.
Speaker:Where it's just, there's no human in the loop, right?
Speaker:For these things that it can just generate code on its own
Speaker:and, push it to production.
Speaker:the role of a professional developer is here to stay for the foreseeable future.
Speaker:We're just going to be better at what we do.
Speaker:again, if we're mindful and not allowing these bugs to just creep in.
Speaker:I like the optimistic point of view here, but yeah, a lot of people
Speaker:I think would agree with you that this is like it was going to happen.
Speaker:Although when you look at some of the software, you do wonder how much of
Speaker:that was actually supervised and how much was just dumped automatically.
Speaker:But that's for another story altogether.
Speaker:Let's talk about the local LLMs, and how good they are by comparison, because
Speaker:I've poked both, but I've never really done like a side by side comparison
Speaker:to really tell how good they are.
Speaker:You used Llama 2, I think, and OpenOrca, and done some side-by-side comparison.
Speaker:So how good are they compared to what you get with Copilot?
Speaker:Actually, this was probably my favorite chapter, to write
Speaker:and to do the research on.
Speaker:It was just super fun.
Speaker:it was an old Lema 2 model.
Speaker:it was generations old at this point.
Speaker:so I've been meaning to revisit it.
Speaker:it produced competent, text.
Speaker:natural language processing, give me a description of this.
Speaker:the code quality was not great.
Speaker:but again, I'm sure this was, six months ago or plus.
Speaker:So I'm sure that the model is 10 times better now.
Speaker:So definitely worth revisiting.
Speaker:yeah, I would say on balance, most of the models that I was running, that I was
Speaker:running locally did not perform as well.
Speaker:but they performed competently.
Speaker:so if you were in a pinch and you didn't have access to the internet and you'd had
Speaker:some foresight and downloaded these models prior, you could still get the job done.
Speaker:but they wouldn't necessarily be my go to.
Speaker:although I did, yeah, I did, just recently.
Speaker:Redownload a new model and it does seem to be much better at this point.
Speaker:I think it was Mistral.
Speaker:that's one of the more interesting areas, in my mind, because that helps
Speaker:get around some of the unknowns.
Speaker:because it, I did, I turned off my wifi.
Speaker:I pulled the network cable.
Speaker:I made sure that I was entirely off the network, prior to using them
Speaker:because I wanted to make sure no context was leaving my computer.
Speaker:Privacy, of your code of, personal data is a primary concern.
Speaker:it's probably the best option out there today.
Speaker:I think this is a really good argument for that.
Speaker:A lot of people will be in a situation where their employers are just not
Speaker:comfortable with just going somewhere.
Speaker:no matter what pinky swears, you got.
Speaker:And I think that this opens, like the remainder of the
Speaker:market that really matters.
Speaker:And I think we're all waiting for Llama 3 to drop, any week now.
Speaker:I'm just worried that it might be too big to run comfortably on your M3.
Speaker:even with quantization, but, let's see, it might actually not increase
Speaker:in size yet become more competent.
Speaker:There are other models too, like Santa Coder, I think at some
Speaker:point was very popular as well.
Speaker:Did, you manage to get a workflow that's more like Copilot and
Speaker:less like chatting to ChatGPT?
Speaker:that would be a really good challenge.
Speaker:To try to turn one of these into a more of a Copilot model.
Speaker:it was ultimately one that I just, couldn't get done in time.
Speaker:um, so no,
Speaker:hopefully you knew what you were getting yourself into, but you
Speaker:wrote an AI book, so you have to update it every three months now.
Speaker:that is true.
Speaker:one of my favorite titles, of late, was, another Manning book, but it
Speaker:was the completely, out of date or the complete, yeah, obsolete.
Speaker:Exactly.
Speaker:A book on, generative AI.
Speaker:I thought that was really clever.
Speaker:lean into it,
Speaker:yeah, that's a book by David Clinton.
Speaker:we had him on the podcast, a couple of weeks ago and, I think it's
Speaker:called the Complete Obsolete Guide to generative AI, it's hilarious.
Speaker:I had so much fun actually just reading that.
Speaker:the humor is, is nice level there.
Speaker:definitely highest possible recommendation buy that book over mine.
Speaker:probably worth mentioning when Devin thing was going on, there was
Speaker:some kind of open source response.
Speaker:I think it was SWE, OpenSWE, something like that.
Speaker:it's probably easy google'able and I haven't gotten to actually
Speaker:testing out, but I was supposed to.
Speaker:I think what we really need to get to is to get one of those open models
Speaker:to behave, 85% as well as Copilot.
Speaker:And then it's basically game over.
Speaker:If I can run it on your laptop, there's no subscription, there's no data leaving.
Speaker:it's a no brainer at that stage.
Speaker:what are we doing with the CPU cycles and the GPU cycles on the
Speaker:MacBook when we're developing?
Speaker:Anyway.
Speaker:Yeah.
Speaker:there
Speaker:Yeah.
Speaker:And it just makes, it makes sense like from a corporate,
Speaker:decision making you could host it, not necessarily, centralized
Speaker:host training off your own data.
Speaker:like that's, yeah, that's really game over.
Speaker:although, yeah,
Speaker:Another alternative way of saying that.
Speaker:It's just the beginning of the fun.
Speaker:yeah.
Speaker:Beginning of the arms race.
Speaker:Yeah.
Speaker:What's the next thing here?
Speaker:What do you expect to come in the coming months and years, Obviously
Speaker:wild predictions, all the usual disclaimers, but what's your take?
Speaker:What's next in coding and AI?
Speaker:I'm gonna give a boring answer.
Speaker:I think it's just gonna be incremental improvement.
Speaker:AGI if it's possible is a long No.
Speaker:what is, what are they calling it?
Speaker:Artificial General Intelligence, yeah, AGI.
Speaker:Yeah.
Speaker:AGI, it.
Speaker:I think that's a ways off if at all, if it's even feasible, we'll see incremental
Speaker:improvements, where the models, I don't want to say hallucinate less because
Speaker:that's that's a feature, not a bug, but, where they get, more and more
Speaker:refined, the output, It becomes, more, more timely, we're starting to see where
Speaker:it can actually connect live to the internet so I just think there's going
Speaker:to be incremental advances like that.
Speaker:until there's a real breakthrough and that will, change the game, like in
Speaker:the same way that, the transformer changed the way that we did the natural
Speaker:language processing and text generation.
Speaker:And, until there's something like that, it's just going to
Speaker:be just incremental improvement.
Speaker:So life goes on, NVIDIA makes more chips, they make faster chips, they become even
Speaker:bigger and they leave the gamers behind even further and we make bigger models, we
Speaker:train them better and we get a little bit closer to a superstar junior developer.
Speaker:Is that roughly what we're talking about here?
Speaker:that's what I would predict.
Speaker:I'm, happy to be wrong Unless it's completely detrimental to all of
Speaker:our fine men and women out there, giving their blood, sweat and tears
Speaker:every day in developing software.
Speaker:Not investment advice, by
Speaker:Yes, exactly.
Speaker:another disclaimer for our us-based, clientele.
Speaker:And what's next for you, Nathan, do you have an eye for the next book?
Speaker:I have a couple of ideas brewing.
Speaker:I really am going to avoid doing just a second edition of this.
Speaker:even though I've alluded to it several times, but I have a couple of ideas
Speaker:that are really brewing and in terms of okay, so now we know how to use them.
Speaker:we've had some practice like now let's apply it to very
Speaker:specific, very niche problems.
Speaker:and are there ways that we can extend it?
Speaker:Are there ways that we can, train it on our own data, things like that.
Speaker:that's where I would see the next logical, area for me to move into.
Speaker:but I would still want something very practical.
Speaker:so yeah, I'll probably just wind up doing a second edition.
Speaker:the book once again is called "AI-Powered Developer".
Speaker:It's published by Manning, which means that it has been available in
Speaker:the early access for a while now.
Speaker:So if you go to manning.com, you can get immediate access and start reading that.
Speaker:It's currently in production, which means that it's going to take a little bit of
Speaker:time before it actually hits places like Amazon in a physical copy and before
Speaker:Nathan can have a party to celebrate that.
Speaker:my guest was Nathan B.
Speaker:Crocker, the co founder and CTO at Checkr.
Speaker:Thank you very much, Nathan.
Speaker:I'll see you
Speaker:Thank you.
Speaker:It was a pleasure.
Speaker:Be well.