Sunday, April 9, 2023

Some AboveTopSecret UFO threads archived as PDFs (first practical use of ChatGPT/AI for a UFO project?)

I've previously archived entire copies of some defunct UFO discussion forums. I've been reluctant to download the entirety of some very large discussion forums, such as AboveTopSecret.com, as they are still functioning.

However, ATS has gradually become less functional over the last few years, with numerous problems developing and not being addressed. When ATS was sold a while ago, the owners raised the concern that ATS might have to shut down due to increased maintenance costs. The new owners seem to have avoided incurring the costs of addressing those maintenance problems, so I am concerned that ATS may suddenly cease to exist.

I posted quite a lot of material on ATS over several years and would prefer that it not be lost. So, I have developed some code, with the help of some other ATS users (particularly "drewlander") and ChatGPT, to archive selected ATS threads as PDFs. I have archived about 70 of my own threads and may archive the threads of a few other ATS old-timers if they give their permission. (I have asked on ATS a few times whether the moderators have any objection to this plan. No objections have been raised, and one moderator has helped develop the relevant code.)



I think this mini-project is the first time ChatGPT (or, indeed, any chatbot or artificial intelligence) has been used to help archive UFO material. I used ChatGPT to help debug an early draft of the relevant code, and it dealt with coding problems far faster than I could have on my own. (I am currently working on another piece of software to archive other material, for which ChatGPT has generated _all_ of the code rather than just helping with debugging.)

The relevant code (as improved by ChatGPT) is:


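@echo off
setlocal enabledelayedexpansion

rem Each line of thread_details.csv is: thread number,number of final page,thread name
rem The tokens setting puts everything after the second comma into the third variable,
rem so a thread name that contains commas is not cut short
for /f "usebackq tokens=1,2* delims=," %%A in ("thread_details.csv") do (
  echo %%A %%B %%C
  rem %%A = thread number, %%B = number of final page, %%C = thread name
  timeout /t 2

  rem Save each page of the thread as its own PDF
  for /l %%P in (1,1,%%B) do (
    rem Zero-pad the page number to three digits so the PDFs merge in page order
    set "n=00%%P"
    set "n=!n:~-3!"
    "C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe" "https://www.abovetopsecret.com/forum/thread%%A/pg%%P" "thread%%A_!n!.pdf"
  )

  rem Merge the per-page PDFs into one file, then delete only the page files of this thread
  pdftk "thread%%A_*.pdf" cat output "ATS thread%%A - %%C.pdf"
  del "thread%%A_*.pdf"
)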
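
For anyone who finds batch syntax hard to follow, here is a rough Python illustration of what the inner loop works out for each page of a thread: the page URL and the zero-padded name of the temporary PDF. It is only a sketch and not part of the batch file. The function name page_jobs is made up for this example, and nothing is downloaded.

```python
def page_jobs(thread_number, last_page):
    """Return (url, pdf_name) pairs for pages 1..last_page of one thread.

    page_jobs is a hypothetical helper for illustration only; it mirrors the
    URL pattern and file naming used in the batch file and fetches nothing.
    """
    jobs = []
    for page in range(1, last_page + 1):
        url = f"https://www.abovetopsecret.com/forum/thread{thread_number}/pg{page}"
        # Three-digit zero padding keeps names in page order when sorted,
        # e.g. _002 comes before _010 (unpadded, "10" would sort before "2").
        pdf_name = f"thread{thread_number}_{page:03d}.pdf"
        jobs.append((url, pdf_name))
    return jobs


if __name__ == "__main__":
    for url, pdf_name in page_jobs(822773, 9):
        print(url, "->", pdf_name)
```

The zero padding matters because the per-page PDFs are later merged using a wildcard, so they need to sort in page order by name. Three digits is enough for threads of up to 999 pages.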

This code iterates through a list saved in a file called "thread_details.csv". That file can be used to store a list of relevant threads, in the format:

THREADNUMBER,NUMBER OF PAGES,BRIEF TITLE

e.g.:

822773,9,Karl12 - Chronological UFO Thread Directory

I have been creating relevant lists using a simple three-column spreadsheet, and then saving it as a CSV file.

(Note: there should be no spaces around the commas that separate the three elements.)
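
If you want to check a list before sending it, something like the following Python sketch could be used. It is not part of the batch file above, and the function name parse_thread_line is made up for this example. It assumes the three-element format described above and splits on the first two commas only, which is my own assumption so that a title containing a comma is kept whole.

```python
def parse_thread_line(line):
    """Split one line of thread_details.csv into (thread_number, pages, title).

    parse_thread_line is a hypothetical helper for illustration only.
    Raises ValueError if the line does not match the expected format.
    """
    # Split on the first two commas only, so the title may contain commas.
    parts = line.rstrip("\r\n").split(",", 2)
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated elements: {line!r}")
    number, pages, title = parts
    # Stray spaces around the commas make these checks fail, as intended.
    for value in (number, pages):
        if not (value.isascii() and value.isdigit()):
            raise ValueError(f"expected digits only, with no spaces: {line!r}")
    if int(pages) < 1:
        raise ValueError(f"number of pages must be at least 1: {line!r}")
    if not title or title != title.strip():
        raise ValueError(f"title is empty or has spaces next to the comma: {line!r}")
    return int(number), int(pages), title


if __name__ == "__main__":
    print(parse_thread_line("822773,9,Karl12 - Chronological UFO Thread Directory"))
```

Spaces inside the title itself are fine; only spaces next to the commas are rejected.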

If someone provides a list of their threads in that format, it would involve very little work for me to archive dozens of their threads.