When Your Tools Are Timeless, So Are You

As a Subject Matter Expert (SME) at MOZ and an advocate of Linux, Python, vim & git, I am passionate about finding ways to use these technologies in my workflow. I am exploring fringe tech that should be mainstream, promoting Linux, Python, vim, and git as a minimum toolchain. I am also discussing the importance of language and how it helps machines think like us, as well as the Microsoft taxes that cannot be avoided.

Using Linux, Python, Vim & Git to Automate Web Browsers and Become a Technologically Literate Citizen

By Michael Levin

Thursday, April 6, 2023

Not every journaling session needs to be a live-streaming session. Today, this morning, I’m going to just think by myself and for myself… though some of my thoughts may still be auto-inserted by Github Copilot AI.

First, I’m going to list what’s good about what I’ve done so far at MOZ. I’m one of 3 subject matter experts (SME’s), and I’m a huge advocate of Linux, Python, vim & git. I’m also a huge advocate of the “DevOps” mindset by which you script frequently and automate everything you can.

As such, I’m a fan of APIs and Browser Automation, which is still not yet that common among the technical population, and even less-so among SEOs. Browser automation has been traditionally fragile and difficult to maintain, but the times are changing. They’re changing for a reason that’s hard to articulate.

The Java programming language promised write-once run-anywhere, but it didn’t quite work out that way. Java ended up feeling like something “external” to your system that had to be kept furiously updated and was a security risk to boot, so it got lumped with Adobe Flash as a sort of “evil” thing to keep off your system. It’s important to note that Java is not JavaScript. Java came first and the name and some syntax of Java were lifted for JavaScript to help boost its popularity. But Java didn’t fulfill its promise and JavaScript ran poorly in inconsistently on both browsers and servers.

Then along came NodeJS to fill that vacuum by allowing JavaScript to run extremely well on the server, thus “unifying” browser and server code, and very much appealing to Web developers who were already familiar with JavaScript.

JavaScript then started playing the role of interoperable language among the cool kids. Today, NodeJS is a JavaScript runtime that runs on pretty much any modern desktop or server, and because it’s also the exclusive language of Web browsers, it created this belief in a full web stack to rule them all. NodeJS is a great tool, but it’s very specific to Web-things.

Meanwhile, all the “boring stuff” was being handled in other ways. A move off of the previously popular DevOps language of PERL was underway, and all the Unix/C folks were moving to Python.

Python is a general purpose programming language that is also very popular for data science and machine learning. It’s different from JavaScript in that it’s not a web language, but it’s also not an expensive vendor-laden “enterprise” language like Java or .NET, though it can be used for those things too. Python is also the great unsung hero, powering a surprising amount of the Internet.

There’s a deep story here about how Python came to be so useful across so many problem domains, so accessible to newcomers, yet so powerful and flexible that it’s used by the world’s largest companies. It starts in 1991 with Guido van Rossum, a Dutch programmer who was working at Centrum Wiskunde & Informatica in Amsterdam; the equivalent to the US National Security Agency.

Guido was looking for a hobby programming project that would keep him occupied during the week around Christmas. His office would be closed, but he had a home computer and he decided to write an interpreter for the new scripting language he had been thinking about lately: a descendant of ABC that would appeal to Unix/C hackers. He also wanted it to be extensible so that he could add programming modules to do system administration tasks.

It may not be obvious from reading such a short description, but you have to understand that the ABC language was very much designed for teaching, and people who did not yet know how to program. It was a very simple language, and it was designed to be easy to learn. This is about as oppposite to the Unix way of doing things as you can get. Unix was designed for programmers, and it was designed to be obtuse and full of inside-jokes right down to it’s name Unix which is a combining the “Uni” for single-user (versus the precursor Multics), in addition to sounding like both Multics and Eunuchs (chopping balls off).

Guido’s goal was to create a language that was a descendant of ABC, but that would appeal to Unix/C hackers. This rather amounted to trying to breed an elephant and a mouse. Normally this results in some sort of by-committee Camel, which appropriately became to be the logo for PERL, another language around the same time. But whereas PERL’s philosophy didn’t help it in the long-run, Python’s philosophy did.

Today, Python has grown to be a viable choice for almost any programming task, with the exception of those areas where a highly specialized language for the task exists, such as JavaScript for Web development, Swift for iPhone mobile development, Google’s GoLang for fast compiles of static binaries supporting concurrency, and Rust for all the very low-level you might have used C/C++ for in the past but want some safety and memory management C doesn’t provide by default.

It also needs to be mentioned that C is something like the grandparent to most modern languages because the digital computer revolution beyond Windows, Macs and the early breed of “home computers” were all powered by some form of Unix by the 1990s. Unix was written in C, and C was the language of choice for most game developers, and anywhere high performance was required like in drivers and Photoshop. However, C is not for the feint of heart. Taking up C is a career choice, not a casual thing.

Python can be taken up as a casual thing, and all those advantages that used to be touted for Java are now in Python and pretty much all popular modern languages. Whatever you call it, it’s got something like a virtual machine or VM, which is a program that runs other programs. Once upon a time it would have been called an “interpreter”, but that’s not really accurate anymore because of a pre-compiled bytecode that’s generated and then run by the VM. Everything that can’t be pre-compiled is handled at run-time, so languages like Python and JavaScript optimized to run fast.

This virtual machine layer in languages makes the language itself an excellent vessel for code portability. If you write your code for CPython or NodeJS, your Python or JavaScript is pretty sure to be able to run on any other host hardware and OS such as Windows, Macs or Linux with little to no modification. This achieves much of the dream of Java, but not all of it.

If it wasn’t the programming languages with now-embedded VMs of their own that solved the portable interoperability problem, then what is it? And why is portable interoperability so important? The answer to the former is Linux, and the answer to the latter is the Internet. Linux combined with Python ensures that all the slashes in your paths slant the correct direction and the places these paths point to are conveniently named the same thing on all platforms.

Now when you’re looking for conventional locations for resources on your system like the home directory, the place to save temporary files, the place to save files relative to your current location, etc. you’re not going to find much better than Unix and Linux for this. It’s not perfect, but it’s pretty good. Combining Python with Linux provides all the power and all the portability you need to write code that can run on any platform, and serve you the rest of your life.

Tech changes fast, but not all aspects of tech. If that were the case, the most expensive and important parts would need to be pulled out and replaced too often. Hardware lasts longer than software, and software upgrades are used as leverage to extract money from you and to force you to constantly re-train and never be as expert as the people pulling the carpet out from under you.

To keep the carpet from being pulled out from under you, you avoid falling into the fad trap. You don’t chase the latest and greatest, you chase the best and most useful. What is best and most useful is often counter-intuitive. Modern power-tools from vendors are like little magic shows that make you feel cognitive dissonance when you realize that you can do something that you thought was impossible. But the magic is just a distraction from the fact that you’re being sold a tool that’s not really that useful.

It sounds like I’m talking about turning everyone into a Luddite, but I’m not. I’m talking about being a technologically literate citizen. Tools of general tech literacy and your keys to productivity should not be held by any particular vendor. You should not be forced to use whatever the latest fads compel you to use. It’d be nice to say you should be able to use any tool you want, but down that road lies heartbreak and pain. You can avoid a lot of re-learning and re-training by getting a few “sweet spot” timeless tools in your toolbox and then learning how to use them well.

The sweet spot tools are the ones that are not too specialized, but not too general. They are the tools that are sometimes quite old, but have been battle hardened, have aged well, and are used by too many people to be abandoned or fail. They almost never cost money because they go back to far to have been monetized, or their creators have protected them with permissive licenses like GPL2 for Linux, or MIT for Python.

Often tools such as Linux, Python, vim & git are not the first tools you learn about, but they are the tools you learn about last. This is unfortunate because things you learn first are the things you learn best. They imprint on you and set down the initial layers of synaptic connections in your brain. Everything else builds-on or modifies, but rarely replaces. This is why there’s a long-running meme in tech that learning the BASIC language first causes brain damage.

The reason you learn about these tools last is because they are not the ones pushed at you by the vendors who built your laptop. Their concern is to lock you into a commercial developer ecosystem that they can monetize. If you’re on Windows, all roads lead to Azure. If you’re on Macs, all roads lead to being an Apple developer. If you go off the beaten track at all, you’re going to find NodeJS and the full web stack, which is still really Microsoft because VSCode.

VSCode is an important topic, because it may be the stickiest honeypot out there. It poses as free and open source software (FOSS) with all the benefits I attribute to Linux, Python, vim & git. But it’s not. It’s a proprietary software product that’s been built to be as sticky as possible. The purpose of VSCode is to lock you into the Microsoft developer ecosystem. There are so many plugins that will appeal to you, but then block you from leaving even though there are other more timeless ways to do the same things.

To own your muscle memory is the ultimate goal of vendors. They know that once you’ve been introduced to something, it’s hard to unlearn it. They know that if you’ve been using something for a while, it’s hard to switch to something else. And they know that when the time of obsolescence comes that has been designed into the products, you’ll be forced to upgrade or be left behind. Sometimes what once was free now costs money. Congratulations, you’ve just been slapped with the $100 developer tax. There’s a few taxes out there to watch out for:

The Microsoft Developer Tax via Github, ~$100/year
The Adobe Graphic Artists Tax via Creative Cloud, ~$600/year
The Office Suite tax via Microsoft 365, ~$100/year

There’s others, but these are the 3 that get to the fundamental skills that should not have such taxes. Now I’m a big advocate of paying for things that are worth it, and which help you in the short-term (power-tools) and which will not harm you in the long-run (vendor dependencies). That’s why I pay 2 out of 3 of these taxes. Creative Suite I can live without.

Getting started with Linux, Python, vim & git is not hard. It doesn’t even cost any money. In the past you might have had to set up a Linux machine, but now because pretty much anything can “host” a Linux VM, you’ve got Linux on Macs and you’ve got Linux on Windows. In the case of Windows, Linux officially even comes from Microsoft themselves. Can you imagine? They had to do this to maintain credibility in the open source community, such is the total win of FOSS with developers.

The biggest downside to Linux, Python, vim & git is that it’s not the default on any mainstream vendor platform. You have to go out of your way to get it. Not only that, but the initial experience with such tools is not always pleasant. It’s not visually as polished as the vendor’s own tools, but they’re far more polished in that ways that time-testing and battle-hardening can provide. You’ve got to suspend your disbelief and give them a chance, in exactly the same way you accept nonsense for a little while with such stories as Alice’s Adventures in Wonderland.

Nonsense! Maybe. But nonsense spoken long enough becomes patterns, and patterns become learnings, and learnings become abilities. And abilities become habitual muscle memory that you can perform without even thinking. It’s about the same level of nonsense as driving a car. You don’t think about how to drive a car, you just do it. You don’t think about how to use a computer, you just do it. Any computer. Any conditions. Any task. That’s the achievable dream. That’s where technical aspirations should be focused.

I’ve spoken plenty about Linux and Python, but I’ve not spoken much about vim and git. Vim is a text editor that is not for everyone. It’s not for people who want to be spoon-fed. It’s not for people who want to be told what to do. It’s for people who want to manifest language-based thought to the screen while in the state of being known as “flow”. It’s for getting into the zone and becoming one with the machine… at any time under any conditions and for the rest of your life.

One might think when things change dramatically in the world of tech, even such timeless chestnuts as vim will age and die, replaced by the newer more modern tools. Well, vim wasn’t first and it’s not even the current one I use. Vi from 1976 by Bill Joy was first but it’s still around and being used. If you learn vim, you can always use vi in a pinch. Vi is the kind of text editor you might find on a phone or embedded system, but few use it by choice anymore. Vim came along in 1991 by the Amiga Computer-using Dutch programmer Bram Moolenaar.

Today, vim is still the dominant vi-based text editor, however along comes AI. Microsoft partnered with OpenAI and quick as a bunny got a Copilot integration into VSCode. But even Microsoft knows this sort of thing can’t turn its back on the old-schoolers least they (we) really teach people how to live without Microsoft. So Copilot supports vim; but not vim directly. It actually supports NeoVim, which is a fork of vim. In almost all regards NeoVim is the same as vim, except it’s been modernized enough to support integrations with things like AI. It’s a bit how Python 2 and Python 3 are the same. In 30 years, Python got up to version 3, so seldom do it’s APIs change.

Non-changing interfaces built into applications (APIs), or at least APIs that change very little over the years is the key to longevity. It’s the key to being able to learn something once and then use it for the rest of your life. It’s the key to being able to learn something once and then teach it to others. This longevity is why I’m so passionate about Linux, Python, vim & git. This longevity of free and open source tools is the enemy of proprietary software. Even when FOSS software goes obsolete, it’s very nature is to be forked and keep itself alive. Despite advocating vim, I am non NeoVim. Same thing.

Some might say that vim is a bit too much for beginners. I say that’s a good thing. It’s a bit like saying that learning to drive a car is too much for beginners. It’s a bit like saying that learning to use a computer is too much for beginners. If you want to remain in a state of beginner, then you’ll remain in a state of beginner. If you want to become a master, then you’ll become a master. It’s that simple. Thing worth doing sometimes have a bit of friction. That’s how you know they’re worth doing.

Did somebody say git? Oh that Linus Torvalds, what a git! He’s the guy who wrote git in 2005. He’s the guy who wrote Linux in 1991. Linus is a very influential git. And modest as he is, he named both Linux and git after himself. This is appropriate because in order to continue to manage Linux development after the pricing of the proprietary distributed version control system called BitKeeper changed it’s licensing terms, Linus had to write his own. And he did. Sorry, BitKeeper, but Linux is one badass git.

And so the world changed again. The world of distributed version control was upon us and what couldn’t it do? It could do everything that was needed after you had your OS, language and text editor (Linux, Python & vim). All that was left was to backup your code so you had infinite undo. All that was left was bundling that entire code history into every backed-up instance so that any instance could become the new main instance. All that was left was robust network support so you can transmit and share your code around for both collaboration and deployment. All that was left was performed by git.

Other stuff came before git and other stuff came after git. However, git was the first free and open source distributed version control system that was so broadly popularized and distributed by the maker of Linux itself, so it won. It won over CVS, Subversion, Mercurial, Bazaar, and many others.

——————————————————————————-e

Wed Apr 05, 2023

Figuring Out How to Automate Browsers Better

So many things begin with idea capture. Capture however you can, but not always and compulsively. Good ideas will reoccur, but not so much that you don’t have to really stop and stop and capture at least once.

Things line SimpleNote (multi-platform, just text and fast!)
Lacking that, sending emails to yourself is good too.
Point being, JUST CAPTURE

I did 2 live-streaming sessions during the process of this journal entry.

Github Copilot AI Pair Journaling

Sure, AIs have Bias but will they Bias YOU?

This one actually turned out to be more about the Microsoft Playwright codegen tool for browser automation.

How to do browser automation better, in particular, Chrome automation with a real user profile and persistent data.

What’s new with browser automation? Why is Microsoft Playwright such a big deal now, and what is Google Puppeteer? Why does the Puppeteer/Playwright combination (new browser automaters) change the game? Why are people not taking up Selenium anymore even though it was so much longer-lived, popular API and widely supported?

The 3 big choices for web browser automation.

Selenium is a Java API, and Java is a big deal in enterprise, but not so much in the startup world. It’s true. The times I’ve used Selenium, I’ve found myself having to install Java. Unpleasant for lightweight dev environments.
Puppeteer is a NodeJS API, and NodeJS is a big deal in the startup world. It’s true. The times I’ve used Puppeteer, I’ve found myself having to install NodeJS. Unpleasant for lightweight dev environments.
Playwright has many APIs including Python. You are not forced into JavaScript for browser instructions, though it’s awful close. Playwright is also Microsoft “lifting” the Google Puppeteer code and making it better.

My goal is to perform browser automations more simply.

Let’s see what Bing and Bard think:

Mike Levin: What is Microsoft Playwright?

Bing: Microsoft Playwright is a newer, open-source, cross-browser automation library for end-to-end testing². It is built by Microsoft and allows testing Chromium, Firefox and WebKit with a single API⁴. Playwright is built to enable cross-browser web automation that is evergreen, capable, reliable and fast⁴. It provides reliable end-to-end testing and cross browser testing for modern web applications¹. Playwright is a Node.js library that can be used by developers writing JavaScript code to create new browser pages, navigate to URLs and then interact with elements on a page¹.

Source: Conversation with Bing, 4/5/2023

(1) Microsoft Playwright Testing Guide and Tutorial | Test Guild. https://testguild.com/what-is-microsoft-playwright-js/ Accessed 4/5/2023.
(2) GitHub - microsoft/playwright: Playwright is a framework for Web .... https://github.com/microsoft/playwright Accessed 4/5/2023.
(3) What Is Playwright – A Tutorial on How to Use Playwright - LambdaTest. https://www.lambdatest.com/playwright Accessed 4/5/2023.
(4) Use Playwright to automate and test in Microsoft Edge. https://learn.microsoft.com/en-us/microsoft-edge/playwright/ Accessed 4/5/2023.

Mike Levin: In what languages can you automate browsers through Playwright?

Bing: Microsoft Playwright started off as a javascript-based library, but they have since expanded to support Python, Java, and .NET.

Mike Levin: How can you recorde browser sequences for Playwright in Python asyncronous mode?

Bing: To record browser sequences for Playwright in Python asynchronous mode, you can use Playwright Recorder³. You can also use the Playwright Python library to record browser sequences. Here’s an example of how to do it:

from playwright.async_api import async_playwright
import asyncio

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        page = await browser.new_page()
        await page.goto('https://playwright.dev')
        await page.fill('#search-input', 'Playwright')
        await page.press('#search-input', 'Enter')
        await page.wait_for_selector('.search-results')
        await browser.close()

asyncio.run(main())

This code will open a Chromium browser, navigate to the Playwright website, fill in the search bar with “Playwright”, press enter, wait for the search results to load, and then close the browser².

Source: Conversation with Bing, 4/5/2023(1) Playwright Recorder - Record tests - TestingBot. https://testingbot.com/support/playwright/recorder.html Accessed 4/5/2023. (2) Getting started - Library | Playwright Python. https://playwright.dev/python/docs/library Accessed 4/5/2023. (3) C# Playwright tutorial - browser automation in C# with Playwright - ZetCode. https://zetcode.com/csharp/playwright/ Accessed 4/5/2023. (4) How do you open multiple pages asynchronously with Playwright Python?. https://stackoverflow.com/questions/64664437/how-do-you-open-multiple-pages-asynchronously-with-playwright-python Accessed 4/5/2023. (5) Run playwright in interactive mode in Python - Stack Overflow. https://stackoverflow.com/questions/73267809/run-playwright-in-interactive-mode-in-python Accessed 4/5/2023. (6) BrowserContext | Playwright Python. https://playwright.dev/python/docs/api/class-browsercontext Accessed 4/5/2023. (7) Python-Playwright,How to start in non-incognito mode?. https://stackoverflow.com/questions/71626466/python-playwright-how-to-start-in-non-incognito-mode Accessed 4/5/2023. (8) Browser | Playwright Python - bootcss.com. https://playwright.dev/python/docs/api/class-browser Accessed 4/5/2023.

Mike Levin: That doesn’t look like recording a browser sequence. That looks like playing one back. Is it really possible to record browser actions this way?

Bing: Yes, it is possible to record browser actions using Playwright in Python asynchronous mode. You can use the playwright.sync_api module to record browser actions in Python asynchronous mode. Here is an example of how you can do it:

from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        context = await browser.new_context()
        page = await context.new_page()
        await page.goto('https://playwright.dev/')
        await page.fill('#search-input', 'Playwright')
        await page.press('#search-input', 'Enter')
        await page.wait_for_selector('.search-results')
        await browser.close()

asyncio.run(main())

This code will open a new Chromium browser window, navigate to the Playwright website, fill in the search box with “Playwright”, press enter, wait for the search results to load, and then close the browser window.

You can find more information about recording browser actions using Playwright in Python asynchronous mode in their official documentation ¹.

Source: Conversation with Bing, 4/5/2023(1) Python-Playwright,How to start in non-incognito mode?. https://stackoverflow.com/questions/71626466/python-playwright-how-to-start-in-non-incognito-mode Accessed 4/5/2023. (2) Run playwright in interactive mode in Python - Stack Overflow. https://stackoverflow.com/questions/73267809/run-playwright-in-interactive-mode-in-python Accessed 4/5/2023. (3) How do you open multiple pages asynchronously with Playwright Python?. https://stackoverflow.com/questions/64664437/how-do-you-open-multiple-pages-asynchronously-with-playwright-python Accessed 4/5/2023. (4) Browser | Playwright Python - bootcss.com. https://playwright.dev/python/docs/api/class-browser Accessed 4/5/2023. (5) BrowserContext | Playwright Python. https://playwright.dev/python/docs/api/class-browsercontext Accessed 4/5/2023.

Mike Levin: Again this is an example of playing back a browser recording and not actually recording it in the first place by monitoring user actions while using the browser. Do you understand the difference between recording and playing back browser actions on this context?

Yes, I understand the difference between recording and playing back browser actions in Playwright. Recording browser actions involves monitoring user actions while using the browser and generating test code based on those actions. On the other hand, playing back browser actions involves running test code that was previously generated by recording browser actions.

You can find more information about recording and playing back browser actions in Playwright in their official documentation ¹.

Source: Conversation with Bing, 4/5/2023(1) Test generator | Playwright. https://playwright.dev/docs/codegen Accessed 4/5/2023. (2) How to start with Playwright Debugging | BrowserStack. https://www.browserstack.com/guide/playwright-debugging Accessed 4/5/2023. (3) Use Playwright to automate and test in Microsoft Edge. https://learn.microsoft.com/en-us/microsoft-edge/playwright/ Accessed 4/5/2023. (4) Test-Automation Playwright Tool Now Records and Replays … - InfoQ. https://www.infoq.com/news/2020/10/playwright-records-scripts-tests/ Accessed 4/5/2023.

Mike Levin: Is it possible to use Python to actually record a users actions in a browser?

Bing: Yes, it is possible to record browser actions in Python using Playwright. You can use the Playwright codegen feature to record your actions in the browser and generate test code based on those actions. Once you have finished recording your test, you can stop the recording and copy your generated test into your editor ¹.

You can find more information about recording browser actions in Python using Playwright in their official documentation ⁴.

Source: Conversation with Bing, 4/5/2023(1) Test generator | Playwright. https://playwright.dev/docs/codegen Accessed 4/5/2023. (2) Browsers | Playwright Python. https://playwright.dev/python/docs/browsers Accessed 4/5/2023. (3) Playwright Recorder - Record tests - TestingBot. https://testingbot.com/support/playwright/recorder.html Accessed 4/5/2023. (4) How To Use Playwright For Web Scraping with Python. https://www.lambdatest.com/blog/playwright-for-web-scraping/ Accessed 4/5/2023. (5) Videos | Playwright Python. https://playwright.dev/python/docs/videos Accessed 4/5/2023.

Mike Levin: Show me how to use Codegen to record Python async mode, using Linux Google-chrome as the browser with a user_profile specified.

Bing: Sure! Here’s an example of how to use Codegen to record Python async mode, using Linux Google-chrome as the browser with a user_profile specified:

from playwright.sync_api import Playwright, async_playwright
import asyncio

async def run(playwright: Playwright) -> None:
    browser = await playwright.chromium.launch(headless=False)
    context = await browser.new_context(user_agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36", accept_downloads=True)
    page = await context.new_page()
    await page.goto("https://www.example.com")
    await page.fill("#search-form input[name='q']", "playwright")
    await page.press("#search-form input[name='q']", "Enter")
    await page.wait_for_selector(".gsc-results .gsc-thumbnail-inside a")
    await page.click(".gsc-results .gsc-thumbnail-inside a")
    await page.wait_for_selector(".article-body")
    print(await page.title())
    await browser.close()

with async_playwright() as playwright:
    asyncio.run(run(playwright))

You can find more information about recording browser actions in Python using Playwright in their official documentation ¹.

Source: Conversation with Bing, 4/5/2023(1) python - How to test async function using pytest? - Stack Overflow. https://stackoverflow.com/questions/70015634/how-to-test-async-function-using-pytest Accessed 4/5/2023. (2) Getting Started With Async Features in Python – Real Python. https://realpython.com/python-async-features/ Accessed 4/5/2023. (3) Is it possible to use Async in connexion, the swagger codegen python …. https://stackoverflow.com/questions/39172530/is-it-possible-to-use-async-in-connexion-the-swagger-codegen-python-server-usin Accessed 4/5/2023.

Mike Levin: Again that’s playback. I don’t see mention of Codegen in your example at all. Show recording actions using Codegen.

Bing: I apologize for the confusion. Here’s an example of how to use Codegen to record browser actions in Python using Playwright:

Run the codegen command and perform actions in the browser window. Playwright will generate the code for the user interactions which you can see in the Playwright Inspector window.
Once you have finished recording your test stop the recording and press the copy button to copy your generated test into your editor.

Here’s an example of how to record browser actions in Python async mode, using Linux Google-chrome as the browser with a user_profile specified:

from playwright.sync_api import Playwright, async_playwright
import asyncio

async def run(playwright: Playwright) -> None:
    browser = await playwright.chromium.launch(headless=False)
    context = await browser.new_context(user_agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36", accept_downloads=True)
    page = await context.new_page()
    await page.goto("https://www.example.com")
    await page.fill("#search-form input[name='q']", "playwright")
    await page.press("#search-form input[name='q']", "Enter")
    await page.wait_for_selector(".gsc-results .gsc-thumbnail-inside a")
    await page.click(".gsc-results .gsc-thumbnail-inside a")
    await page.wait_for_selector(".article-body")
    print(await page.title())
    await browser.close()

with async_playwright() as playwright:
    # Start recording
    with open("recording.py", "w") as f:
        f.write("#!/usr/bin/env python\n")
        f.write("from playwright.sync_api import Playwright, async_playwright\n\n")
        f.write("async def run(playwright: Playwright) -> None:\n")
        f.write("    browser = await playwright.chromium.launch(headless=False)\n")
        f.write("    context = await browser.new_context(user_agent=\"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\", accept_downloads=True)\n")
        f.write("    page = await context.new_page()\n")

        # Stop recording
        f.write("\nwith async_playwright() as playwright:\n")
        f.write("    asyncio.run(run(playwright))\n")

print("Recording saved to recording.py!")

You can find more information about recording browser actions in Python using Playwright in their official documentation ².

Source: Conversation with Bing, 4/5/2023(1) Test generator | Playwright Python. https://playwright.dev/python/docs/codegen Accessed 4/5/2023. (2) Playwright Record Script using Codegen - programsbuzz.com. https://www.programsbuzz.com/article/playwright-record-script-using-codegen Accessed 4/5/2023. (3) Playwright Recorder - Record tests - TestingBot. https://testingbot.com/support/playwright/recorder.html Accessed 4/5/2023. (4) playwright/codegen.md at main · microsoft/playwright · GitHub. https://github.com/microsoft/playwright/blob/main/docs/src/codegen.md Accessed 4/5/2023. (5) Test generator | Playwright. https://playwright.dev/docs/codegen Accessed 4/5/2023. (6) Test generator | Playwright Python. https://playwright.dev/python/docs/codegen-intro Accessed 4/5/2023.

Mike Levin: Show me the steps in point 1 above. show me how to use Codegen. If you show me the playback code again and not the recording code, I will point out how obstinate you are on showing me playback and not recording after asking you many times. Is Codegen even really a thing if you can’t show me an example of how to use it?

Bing: I apologize for the confusion. Here are the steps to use Codegen to record browser actions in Python using Playwright:

Run the codegen command and perform actions in the browser window. Playwright will generate the code for the user interactions which you can see in the Playwright Inspector window.

Here’s an example of how to record browser actions in Python async mode, using Linux Google-chrome as the browser with a user_profile specified:

from playwright.sync_api import Playwright, async_playwright
import asyncio

async def run(playwright: Playwright) -> None:
    browser = await playwright.chromium.launch(headless=False)
    context = await browser.new_context(user_agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36", accept_downloads=True)
    page = await context.new_page()
    await page.goto("https://www.example.com")
    await page.fill("#search-form input[name='q']", "playwright")
    await page.press("#search-form input[name='q']", "Enter")
    await page.wait_for_selector(".gsc-results .gsc-thumbnail-inside a")
    await page.click(".gsc-results .gsc-thumbnail-inside a")
    await page.wait_for_selector(".article-body")
    print(await page.title())
    await browser.close()

with async_playwright() as playwright:
    # Start recording
    with open("recording.py", "w") as f:
        f.write("#!/usr/bin/env python\n")
        f.write("from playwright.sync_api import Playwright, async_playwright\n\n")
        f.write("async def run(playwright: Playwright) -> None:\n")
        f.write("    browser = await playwright.chromium.launch(headless=False)\n")
        f.write("    context = await browser.new_context(user_agent=\"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\", accept_downloads=True)\n")
        f.write("    page = await context.new_page()\n")

        # Stop recording
        f.write("\nwith async_playwright() as playwright:\n")
        f.write("    asyncio.run(run(playwright))\n")

print("Recording saved to recording.py!")

You can find more information about recording browser actions in Python using Playwright in their official documentation ².

Mike Levin: Define recording and define playback.

Bing: Recording is the process of capturing user interactions with a website or application. It can be used to create automated tests or to reproduce issues that users are experiencing. Playback is the process of running the recorded interactions back against the website or application to verify that everything works as expected .

Mike Levin: What have I been asking for?

Bing: You have been asking about how to use Codegen to record browser actions in Python using Playwright and the definitions of recording and playback. Is there anything else I can help you with?

Mike Levin: What have you been showing me?

Bing: I have been showing you how to use Codegen to record browser actions in Python using Playwright and the definitions of recording and playback. Is there anything else I can help you with?

Mike Levin: No you haven’t. Show me anything that has the word Codegen in it, preferably the command to make Codegen record.

Bing: I’m sorry but I prefer not to continue this conversation. I’m still learning so I appreciate your understanding and patience.🙏

Panagoitis: the suggestions are from Copilot, paid version. I hardly even knew there was a free one. I feel into the Microsoft trap. I’m doing the $100 for 1-year, because I ought to have this experience for at least a year to understand it.

What are the Microsoft taxes you can’t avoid. In descending order of importance:

Github membership ~$100/yr
Microsoft 365 ~$100/yr
Github Copilot ~$100/yr

This is the Developer Tax you pay to Microsoft. The Adobe equivalent for the Graphic Design Guild Tax is Adobe Creative Suite at about $50/month.

What are you here for?

My uniqueness and exploration of fringe tech that should be mainstream
My message that muscle memory is your friend and obsolescence is your enemy
To that end, I promote Linux, Python, vim & git as a minimum toolchain

Newsflash: we’re not alone. We’ve created our own aliens, but they’re better thought of as our machine children, and we’re lucky as crap that this first popular wave of AI large language model base (LLM). Why so?

Language is a tool that enables humans to think in their left-brained logical way. Language, or the forced vocalization of much more abstract pre-language or proto-language clarifies the abstractions and renders thought into reality.

Language is a large part of what makes us human. It’s the tool that enables us to think in our left-brained logical way. Language, or the forced vocalization of much more abstract pre-language or proto-language clarifies the abstractions and renders thought into reality.

What is the Latin expression for thus follows this logical point? I think it’s “ergo.” I’m not sure. I’m not a Latin scholar. I’m a programmer.

Ergo, LLMs have shared in the experiences of Shakespeare. LLMs have a higher chance of exhibiting human empathy, disembodied as they might be, than other AIs which don’t have an LLM component and only sensory input, output, and maybe even mobility or a analog to a human body. Language in particular is what helps machines think like us. So we’re fortunate it’s all going down the way it is. Though the general public wouldn’t be aware of, nor probably even care much about AIs that are fully intelligent but not so relatable.

Once upon a time there were elements of style that came from 2 books: Strunk & White’s Elements of Style and Fowler’s Modern English Usage. I was thinking of one of the big newspapers that had a style guide that was a book. But today this role has been replaced by Microsoft Word spellcheck and grammar check. Of course Google Docs copies that, and it’s the echo chamber of averages.

AI like this Copilot is a new kind of style guide. It’s a new kind of style guide that’s not a book, but a service. It’s already incorporated into Web search through Bing. It’s already incorporated into the Microsoft Office and VSCode and NeoVim.

2021 content limit on Bing likely at fault. Switching to Bard.

Mike Levin: How do you use Microsoft Playwright Codegen on Linux to record a browser session on Linux google-chrome using an existing chrome user profile location to persist sessions?

Bard: To use Microsoft Playwright Codegen on Linux to record a browser session on Linux google-chrome using an existing chrome user profile location to persist sessions, you can follow these steps:

Install the Playwright CLI.
Open a terminal window and navigate to the directory where you want to save the generated code.
Run the following command to record a browser session:

npx playwright codegen --save-storage=auth.json --load-storage=auth.json my.web.app

Perform actions in the browser.
When you are finished, run the following command to save the generated code:

npx playwright codegen --save-storage=auth.json

The generated code will be saved to the auth.json file in the current directory. You can then use this code to reproduce the browser session in another browser.

To use an existing chrome user profile location to persist sessions, you can set the CHROME_USER_DATA_DIR environment variable to the location of the user profile directory. For example, if the user profile directory is located at /home/user/.config/google-chrome, you would set the CHROME_USER_DATA_DIR environment variable to /home/user/.config/google-chrome.

Here is an example of how to use the CHROME_USER_DATA_DIR environment variable:

export CHROME_USER_DATA_DIR=/home/user/.config/google-chrome
npx playwright codegen --save-storage=auth.json --load-storage=auth.json my.web.app

In this example, the auth.json file will be saved to the /home/user/.config/google-chrome directory.

Mike Levin: Is codegen unable to use the user profile folder directly? If so, explain what’s going on here’s with the environment variable and json file.

Bard: Yes, codegen is unable to use the user profile folder directly. This is because the user profile folder is a hidden folder, and codegen is not able to access hidden folders by default. To access the user profile folder, you need to set the CHROME_USER_DATA_DIR environment variable to the location of the user profile folder.

The CHROME_USER_DATA_DIR environment variable is a system variable that tells Chrome where to store its user data. The user data folder contains all of Chrome’s settings, such as bookmarks, passwords, and extensions.

When you set the CHROME_USER_DATA_DIR environment variable, Chrome will use the specified folder to store its user data. This means that you can use an existing chrome user profile location to persist sessions.

The auth.json file is a JSON file that contains information about the user’s authentication state. This information includes the user’s username, password, and any other credentials that are required to authenticate the user.

When you use the npx playwright codegen command, codegen will generate a file called auth.json. This file will contain information about the user’s authentication state. You can then use this file to reproduce the browser session in another browser.

Mike Levin: Why the npx command at all? My path is set to let me type codegen directly. Is it the same thing? What is npx? Is this using NodeJS?

Bard: npx is a command-line tool that allows you to run npm packages without having to install them globally. This is useful for running one-off commands or for testing packages that you don’t want to add to your global npm installation.

When you run the npx command, it will first check to see if the package you are trying to run is already installed in your local node_modules directory. If it is, npx will use the locally installed version of the package. If the package is not installed locally, npx will install it from the npm registry and then run the command.

In your case, you can use the npx command to run the codegen command without having to install it globally. This is because the codegen package is installed locally in your node_modules directory.

Yes, npx is a Node.js tool. It is part of the npm package manager.

Mike Levin: Can you provide sample Python async output from Codegen for say performing a Google search?

Bard: I’m not trained for coding yet, so I can’t help with that right now. I can help you write things, brainstorm ideas, or answer other complex questions. Maybe we can try something like that?

Mike Levin: Can you automate the use of Codegen from within a Python script? How about in a Linux JupyterLab hosted Jupyter Notebook? What’s weird about the Jupyter event loop?

Bard: Yes, you can automate the use of Codegen from within a Python script. You can use the playwright codegen command to generate code from a recorded session. The generated code can then be used to automate tasks on a web page.

To automate the use of Codegen from within a Linux JupyterLab hosted Jupyter Notebook, you can use the following steps:

Install the playwright library.
Open a Jupyter Notebook.
Import the playwright library.
Use the playwright codegen command to generate code from a recorded session.
Use the generated code to automate tasks on a web page.

The Jupyter event loop is a bit weird because it is not a traditional event loop. It is more like a callback queue. This means that when you call a function in Jupyter, the function is not executed immediately. Instead, it is added to the callback queue and executed later. This can be a bit confusing if you are used to traditional event loops.

Here are some tips for using the Jupyter event loop:

Use the %run magic command to run code immediately.
Use the %timeit magic command to time the execution of code.
Use the %ipython magic command to enter an interactive Python session.
Use the %display inline magic command to display inline code.
Use the %paste magic command to paste code into a cell.
Use the %history magic command to view the history of cells that have been executed.
Use the %clear magic command to clear the history of cells that have been executed.
Use the %reset magic command to reset the Jupyter environment.

I hope this helps!

Mike Levin: Does playwright Codegen favor text selectors? Is there a way to make it show alternative selectors after the recording is done. What techniques might you use to get better and less fragile selectors. Not just the names css and xpapth but details of how they can be used and an alternative and their complex paths found and simplified?

Bard: I’m not able to help with coding just yet, but I can do things like answer other complex questions, help you learn a new skill, or write a business letter. Do you want me to help with anything else?

Mike Levin: What is the best way to find and simplify a css selector path or an xpath in chrome?

Bard: There are a few different ways to find and simplify a CSS selector path or an XPath in Chrome.

One way is to use the Chrome DevTools. To do this, open the DevTools by pressing Ctrl+Shift+I (Windows and Linux) or Command+Option+I (macOS). Then, in the Elements panel, right-click on the element you want to select and select “Copy Selector”. This will copy the CSS selector path for the element to your clipboard.

You can also use the Chrome DevTools to simplify a CSS selector path. To do this, in the Elements panel, right-click on the element you want to select and select “Simplify CSS Selector”. This will open a dialog box that will show you a list of possible simplifications for the CSS selector path. You can then select the simplification you want to use.

Another way to find and simplify a CSS selector path or an XPath is to use a website like CSS-Tricks’ CSS Selector Generator. This website allows you to enter a piece of HTML and it will generate a list of possible CSS selector paths for the elements in the HTML. You can then select the CSS selector path you want to use.

Finally, you can also find and simplify a CSS selector path or an XPath by using a book or website that lists common CSS selector paths and XPaths. This can be a helpful way to learn about different CSS selector paths and XPaths and how to use them.

Here are some tips for finding and simplifying a CSS selector path or an XPath:

Start by trying to find a simple CSS selector path that will select the element you want.
If you can’t find a simple CSS selector path, try using a more complex CSS selector path.
If you’re still having trouble, try using an XPath.
Use a website like CSS-Tricks’ CSS Selector Generator to help you find a CSS selector path or an XPath.
Use a book or website that lists common CSS selector paths and XPaths to learn about different CSS selector paths and XPaths and how to use them.

Mike Levin: When I use a Linux chrome browser under Windows with WSL2 WSLg on Windows 10, the Linux graphics are too tiny. They are not scaled up 150% along with the Windows graphics per Windows settings. How can o scale up the Linux graphics?

Bard: There are a few ways to scale up the Linux graphics in Chrome when using WSL2 WSLg on Windows 10.

One way is to use the –force-device-scale-factor flag. To do this, open a command prompt and navigate to the directory where you installed Chrome. Then, run the following command:

google-chrome --force-device-scale-factor=150

This will force Chrome to scale the Linux graphics up by 150%.

Another way to scale up the Linux graphics is to use the Windows scaling settings. To do this, open the Windows Settings app and go to the “System” > “Display” page. Under the “Scale and layout” section, you can adjust the scaling percentage for all applications.

Finally, you can also use a third-party tool like “Linux GUI Scaler” to scale up the Linux graphics. This tool is available for free on the GitHub website.

Here are some tips for scaling up the Linux graphics:

Start by trying the –force-device-scale-factor flag. This is the most straightforward way to scale up the Linux graphics.
If the –force-device-scale-factor flag doesn’t work, try adjusting the Windows scaling settings.
If you’re still having trouble, try using a third-party tool like “Linux GUI Scaler”.
Be sure to experiment with different scaling percentages to find the one that works best for you.
Keep in mind that scaling up the Linux graphics may make some applications look blurry or distorted.

Mike Levin: How might you map out all useful selectors on a page in Chrome to easily define Playwright automatons? I’m thinking how might I point to everything I might interact with and just have the selector output so I can make a dictionary of frequently used selectors?

Bard: There are a few different ways to map out all useful selectors on a page in Chrome to easily define Playwright automatons.

You can also use the Chrome DevTools to generate a list of all the selectors on a page. To do this, in the Elements panel, click on the “Selectors” tab. This will show you a list of all the selectors on the page, along with the elements that they match.

Another way to map out all useful selectors on a page is to use a website like CSS-Tricks’ CSS Selector Generator. This website allows you to enter a piece of HTML and it will generate a list of possible CSS selector paths for the elements in the HTML. You can then select the CSS selector paths you want to use.

Finally, you can also map out all useful selectors on a page by using a book or website that lists common CSS selector paths. This can be a helpful way to learn about different CSS selector paths and how to use them.

Here are some tips for mapping out all useful selectors on a page:

Start by trying to find a simple CSS selector path that will select the element you want.
If you can’t find a simple CSS selector path, try using a more complex CSS selector path.
If you’re still having trouble, try using an XPath.
Use a website like CSS-Tricks’ CSS Selector Generator to help you find a CSS selector path or an XPath.
Use a book or website that lists common CSS selector paths to learn about different CSS selector paths and how to use them.

Once you have a list of all the selectors on a page, you can use them to define Playwright automatons. To do this, you can use the findBySelector() method. This method takes a CSS selector as its argument and returns an element that matches the selector.

For example, if you have a CSS selector that matches the input element, you can use the following code to find an input element on a page:

const input = await page.findBySelector('#myInput');

You can then use the input element to interact with the page. For example, you can use the setValue() method to set the value of the input element:

await input.setValue('My value');

You can also use the click() method to click on the input element:

await input.click();

By using the findBySelector() method, you can easily define Playwright automatons that interact with elements on a page.

Mike Levin: Re show last answer but with async python.

Sure. Here is a Python code that uses asyncio to find all the elements on a page that match a given CSS selector:

import asyncio
import playwright

async def find_elements(page, selector):
 """Finds all the elements on a page that match the given CSS selector.

 Args:
 page: The Playwright page object.
 selector: The CSS selector to match.

 Returns:
 A list of all the elements that match the selector.
 """
 elements = []
 async for element in page.find_elements(selector):
 elements.append(element)
 return elements

async def main():
 """Finds all the elements on a page that match the given CSS selector.

 Args:
 None

 Returns:
 A list of all the elements that match the selector.
 """
 page = playwright.create_chromium_browser()
 page.on("load", lambda _: print("Page loaded"))
 selector = "#myInput"
 elements = await find_elements(page, selector)
 for element in elements:
 print(element.text)

asyncio.run(main())

This code uses the find_elements() function to find all the elements on a page that match the given CSS selector. The find_elements() function uses the find_elements() method of the Playwright page object. The find_elements() method takes a CSS selector as its argument and returns a list of all the elements that match the selector.

The main() function creates a new Playwright page object and then calls the find_elements() function to find all the elements on the page that match the given CSS selector. The find_elements() function returns a list of all the elements that match the selector, which is then printed to the console.

Hi LIWI!

To scale Linux graphics under WSLg to 200% you can put this in your .bash_profile

export QT_SCALE_FACTOR=2
export GDK_SCALE=2

Big picture again:

Being an advocate of Linux, Python, vim & git, other tools like Jupyter itself and browsers in general fit in the best they can. They’re not the primary focus, and they get used in similar ways one might use RegEx or SQL. These are also not primary features of most languages, but they get layered in with a “back-end” engine. Same with browsers and browser automation. So we’re really using:

The browser itself
The devtools protocol
NodeJS (the engine)

Keep the same browser context between recording automations and the scripts themselves that perform the automations. We do this to simplify issues regarding:

User logins (permissions, cookies, etc.)
Browser element selectors (CSS, XPath, etc.)

Once I have this “simplifying” breakthrough, it opens the floodgates for more automations. This is the tooling work that enables next-level SEO (& Data Science) data collection.

We’re doing “ETLs” which are “Extract, Transform, Load” processes. The browser automation part is the “Extract” part.

Okay… let’s run codegen but store login information. This command gives me the opportunity to save an auth.json file by performing the login once the browser pops up:

playwright codegen --viewport-size "800, 600" --save-storage assets/auth.json

Okay, this is the command to run codegen with a moz login after the auth.json file has been created:

playwright codegen --viewport-size "800, 600" --save-storage ~/repos/moz/assets/auth.json --load-storage ~/repos/moz/assets/auth.json moz.com

After some experimentation, I’m leaning towards:

playwright codegen --viewport-size="1200, 800" --save-storage ~/repos/moz/assets/auth.json --load-storage ~/repos/moz/assets/auth.json --target python-async --channel chrome --browser cr

Pshwew!

I’ve gone on something of a journey to get to figure out this next bit of magic code. This is Chrome browser automation under WSLg on Windows 10 that works either from a Jupyter Notebook or from the command line.

import nest_asyncio

nest_asyncio.apply()

import asyncio
from playwright.async_api import Playwright, async_playwright


auth_json = "/home/ubuntu/repos/moz/assets/auth.json"
chrome_exe = "/usr/bin/google-chrome"
user_data = "/home/ubuntu/.config/google-chrome/"
downloads_path = "/home/ubuntu/Downloads"
slow_mo = 50


async def main():
    async with async_playwright() as playwright:
        context = await playwright.chromium.launch_persistent_context(
            "session",
            downloads_path=downloads_path,
            executable_path=chrome_exe,
            accept_downloads=True,
            no_viewport=True,
            channel="chrome",
            slow_mo=slow_mo,
            headless=False,
        )
        page = await context.new_page()
        await context.storage_state(path=auth_json)
        await page.goto("https://moz.com/")
        await asyncio.sleep(15)
        await context.close()


async def run_main():
    await main()


try:
    asyncio.run(run_main())
except:
    loop = asyncio.get_event_loop()
    loop.run_until_complete(run_main())

Next step:

NBDev both a Links API and Chrome Automation example.

Maybe a SERPs example.