Login to participate
  
Register   Lost ID/password?
Louis Kessler’s Behold Blog » Blog Entry           prev Prev   Next next

Using AI Bots for Programming - Sun, 16 Mar 2025

I’m working hard on Version 2 of Behold, and I needed to write some code to strip out HTML from a string. This would be so Behold could display the text embedded in the HTML correctly without formatting.

Since I’m a Windows and Office 360 aficionado, I get a version of Microsoft’s AI bot named Copilot built into my internet browser Edge and my Office products. So that’s the one most convenient for me to use. And I have had good luck asking Copilot programming questions to help me with small tasks in my programming language Delphi in the past.

But this time, this was a task I really didn’t want to do myself. I saw the solution as scanning through the string character by character and parsing out what was html to remove it. Not too difficult, but a bit of a pain to code. So I thought I’d let Copilot try.


Query Number 1

image

uses
  System.RegularExpressions;

function RemoveHTML(const Input: string): string;
var
  RegEx: TRegEx;
begin
  // Create a regular expression to match HTML tags
  RegEx := TRegEx.Create(’<[^>]+>’);
  // Replace all HTML tags with an empty string
  Result := RegEx.Replace(Input, ”);
end;

// Example usage:
var
  Original, Cleaned: string;
begin
  Original := ‘<html><body>Hello, <b>World</b>!</body></html>’;
  Cleaned := RemoveHTML(Original);
  WriteLn(Cleaned);  // Output will be "Hello, World!"
end.

image

Well that was a surprise to me. A very simple two line function. It uses a mathematical notation called a “regular expression” to find the HTML bits. In all my programming days, I’ve never got to understand them well as I find that once you write one of those regular expressions, such as the “<[^>]+>” that Copilot created, then it is almost impossible to reverse engineer what it does and how it works.

I was a great fan of the APL programming language, and it had the same feature of being impossible to figure out after the fact.

Nonetheless, Copilot created me some code that I could use even if I didn’t fully understand how it worked.


Query Number 2

I wanted a slight modification to that. HTML can include urls in its HREF values, and that was information I ddn’t want thrown out, so I then asked Copilot:

image

uses
  System.RegularExpressions;

function ReplaceHrefTags(const Input: string): string;
var
  RegExHref, RegExTags: TRegEx;
begin
  // Regular expression to match href tags and extract href value
  RegExHref := TRegEx.Create(’<as+[^>]*href="([^"]+)"[^>]*>’);
  // Replace href tags with just their href values
  Result := RegExHref.Replace(Input, ‘$1′);

  // Regular expression to match all other HTML tags
  RegExTags := TRegEx.Create(’<[^>]+>’);
  // Remove other HTML tags
  Result := RegExTags.Replace(Result, ”);
end;

// Example usage:
var
  Original, Processed: string;
begin
  Original := ‘<html><body>Visit <a href="http://example.com">this link</a> for more information.</body></html>’;
  Processed := ReplaceHrefTags(Original);
  WriteLn(Processed); // Output will be: "Visit http://example.com for more information."
end.

image

Excellent!  Just one more thing.


Query Number 3

image

uses
  System.RegularExpressions;

function ProcessHTML(const Input: string): string;
var
  RegExHref, RegExTags: TRegEx;
begin
  // Regular expression to match <a> tags with href attributes and capture both href value and inner text
  RegExHref := TRegEx.Create(’<as+[^>]*href="([^"]+)"[^>]*>(.*?)</a>’);
  // Replace <a> tags with "href value (inner text)"
  Result := RegExHref.Replace(Input, ‘$1 ($2)’);

  // Regular expression to match all other HTML tags
  RegExTags := TRegEx.Create(’<[^>]+>’);
  // Remove remaining HTML tags
  Result := RegExTags.Replace(Result, ”);
end;

// Example usage:
var
  Original, Processed: string;
begin
  Original := ‘<html><body>Visit <a href="http://example.com">this link</a> for more information.</body></html>’;
  Processed := ProcessHTML(Original);
  WriteLn(Processed); // Output will be: "Visit http://example.com (this link) for more information."
end.

image

I am rather impressed. I ended up with the code that I wanted after only 3 queries to Copilot. And it provided me with the code, an example, and an explanation of how it worked.

I know it would have taken me considerably more than 5 minutes to write that query myself without the use of regular expressions.

Thank you Copilot! Maybe you can’t completely translate my Russian documents for me yet, but it looks like you could be very useful to me for my programming.


Just One More Thing

While testing this code on some of my data, I found one more revision was needed:

image

uses
  System.RegularExpressions;

function ProcessHTML(const Input: string): string;
var
  RegExHref, RegExTags: TRegEx;
begin
  // Regular expression to match <a> tags with href attributes and capture both href value and inner text
  RegExHref := TRegEx.Create(’<as+[^>]*href="([^"]+)"[^>]*>(.*?)</a>’);
 
  // Replace <a> tags based on the condition
  Result := RegExHref.Replace(Input,
    function(const Match: TMatch): string
    begin
      if Match.Groups[1].Value = Match.Groups[2].Value then
        Result := Match.Groups[1].Value  // Include only href value
      else
        Result := Match.Groups[1].Value + ‘ (’ + Match.Groups[2].Value + ‘)’;  // Include href value and text
    end
  );

  // Regular expression to match and remove all other HTML tags
  RegExTags := TRegEx.Create(’<[^>]+>’);
  Result := RegExTags.Replace(Result, ”);
end;

// Example usage:
var
  Original, Processed: string;
begin
  Original := ‘<html><body>Check out <a href="http://example.com">http://example.com</a> and <a href="http://another.com">this link</a>.</body></html>’;
  Processed := ProcessHTML(Original);
  WriteLn(Processed); // Output will be: "Check out http://example.com and http://another.com (this link)."
end.

image

But this time there was a problem. The code would not compile.

Would Copilot be able to fix it?

image

uses
  System.RegularExpressions;

function ReplaceHrefTags(const Input: string): string;
var
  RegExHref, RegExTags: TRegEx;
  Matches: TMatchCollection;
  Match: TMatch;
  Processed: string;
begin
  Processed := Input;

  // Regular expression to match <a> tags with href attributes and capture both href value and inner text
  RegExHref := TRegEx.Create(’<as+[^>]*href="([^"]+)"[^>]*>(.*?)</a>’);
  Matches := RegExHref.Matches(Processed);

  // Process each <a> tag match
  for Match in Matches do
  begin
    if Match.Groups[1].Value = Match.Groups[2].Value then
      Processed := StringReplace(Processed, Match.Value, Match.Groups[1].Value, [rfReplaceAll])
    else
      Processed := StringReplace(Processed, Match.Value, Match.Groups[1].Value + ‘ (’ + Match.Groups[2].Value + ‘)’, [rfReplaceAll]);
  end;

  // Regular expression to match all other HTML tags
  RegExTags := TRegEx.Create(’<[^>]+>’);
  // Remove all remaining HTML tags
  Result := RegExTags.Replace(Processed, ”);
end;

// Example usage:
var
  Original, Processed: string;
begin
  Original := ‘<html><body>Check out <a href="http://example.com">http://example.com</a> and <a href="http://another.com">this link</a>.</body></html>’;
  Processed := ReplaceHrefTags(Original);
  WriteLn(Processed); // Output will be: "Check out http://example.com and http://another.com (this link)."
end.

image

And that worked!

Very well done Copilot.

2 Comments           comments Leave a Comment

1. trevithick (trevithick)
France flag
Joined: Mon, 24 Mar 2025
1 blog comment, 0 forum posts
Posted: Mon, 24 Mar 2025  Permalink

I am happy to hear that v2 is still in the works.

I own Office 2021 which you would think would incorporate CoPilot, but it does not. You must use Office 365.

A while back, I was playing with some Python code on my Raspberry Pi, and could not figure it out so I asked ChatGPT to code it for me. After about four iterations, it worked, and I was able to see my previous errors.

2. Louis Kessler (lkessler)
Canada flag
Joined: Sun, 9 Mar 2003
291 blog comments, 245 forum posts
Posted: 6 days, 8 hrs ago  Permalink

Check the “Future” page for the status of v2.

I believe Copilot is available for free in the Microsoft Edge browser, even without an Office 365 subscription.

Leave a Comment

You must login to comment.

Login to participate
  
Register   Lost ID/password?