As per SSWConsulting/SSW.Website#1263, we need a way to track which pages have been migrated to the v3 website.
This repository serves as a tracker of which pages have been migrated; we add the zz prefix to each page that we have migrated to the v3 Next.js site.
We are not adding the zz prefix to pages on the original repo because we want those pages to stay live, so we can compare the new pages with them.
- v1 site - https://dev.azure.com/ssw2/ssw.website
- Bypass Front Door (to see old pages) - https://prod.ssw.com.au/
Effectively, you can treat /archive like a project with an automated deployment pipeline. There is a GitHub Action that automatically syncs the contents of /archive to blob storage, so all you need to do to update one of the migrated pages is get your PR merged. After your PR is merged into main, you will be able to view your updates on the website under ssw.com.au/archive. These changes are synced, meaning that if a file is deleted from the repo it will also be deleted from blob storage.
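The sync itself is handled by the GitHub Action, but conceptually it behaves something like the sketch below (hypothetical; the container name, connection-string variable, and use of the Azure SDK for Python are assumptions, and the real workflow may use a different tool entirely):

```python
# Hypothetical sketch of the /archive -> blob storage sync (the real GitHub Action may differ)
import os
from azure.storage.blob import ContainerClient

ARCHIVE_DIR = "archive"   # folder in this repo that gets synced
CONTAINER = "archive"     # assumed container name
conn_str = os.environ["AZURE_STORAGE_CONNECTION_STRING"]

container = ContainerClient.from_connection_string(conn_str, CONTAINER)

# Upload every file under /archive, preserving its relative path as the blob name
local_files = set()
for root, _, files in os.walk(ARCHIVE_DIR):
    for name in files:
        path = os.path.join(root, name)
        blob_name = os.path.relpath(path, ARCHIVE_DIR).replace("\\", "/")
        local_files.add(blob_name)
        with open(path, "rb") as f:
            container.upload_blob(name=blob_name, data=f, overwrite=True)

# Because the sync is two-way, blobs with no matching local file are deleted
for blob in container.list_blobs():
    if blob.name not in local_files:
        container.delete_blob(blob.name)
```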
Important note: If you add or delete one of the pages in the repo, you will need to update the sitemap. You can do this by running sitemap_generator.py at the base of the repo (SSW.Website-v1-Progress).
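For illustration only, a sitemap regeneration of this kind usually just walks the archived pages and rewrites sitemap.xml; the sketch below is hypothetical (folder name, base URL, and output format are assumptions), and the real sitemap_generator.py may work differently:

```python
# Hypothetical sketch only - see sitemap_generator.py for the actual implementation
import os

BASE_URL = "https://www.ssw.com.au/archive"   # assumed base URL for archived pages
ARCHIVE_DIR = "archive"                       # assumed folder containing archived pages

urls = []
for root, _, files in os.walk(ARCHIVE_DIR):
    for name in files:
        if name.endswith((".html", ".aspx")):
            rel = os.path.relpath(os.path.join(root, name), ARCHIVE_DIR).replace("\\", "/")
            urls.append(f"  <url><loc>{BASE_URL}/{rel}</loc></url>")

with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
    f.write("\n".join(urls))
    f.write("\n</urlset>\n")
```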
- Python 3.x - https://www.python.org/downloads/
- Latest Chrome Driver - you can download it from here: https://googlechromelabs.github.io/chrome-for-testing/#stable
This script will save every page from the v1 website as HTML files in the `history` folder. It will scan each page, locate any images, and save them in the `history` folder as well, preserving the original path of the images on the v1 site.
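At a high level, the archiving step looks something like the sketch below (simplified and hypothetical; the real html_archiver.py also runs the fix_* steps described later in this README):

```python
# Simplified sketch of the page-archiving step (see html_archiver.py for the real logic)
import os
from urllib.parse import urlparse

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

service = Service("C:\\selenium\\chromedriver.exe")   # point this at your Chrome Driver
driver = webdriver.Chrome(service=service)

def archive_page(url: str, out_dir: str = "history") -> None:
    driver.get(url)                                   # render the page in Chrome
    soup = BeautifulSoup(driver.page_source, "html5lib")

    # The real script also applies the fix_* steps described below (scripts, CSS, images, links)

    page_path = os.path.join(out_dir, urlparse(url).path.lstrip("/") or "index.html")
    os.makedirs(os.path.dirname(page_path), exist_ok=True)
    with open(page_path, "w", encoding="utf-8") as f:
        f.write(str(soup))                            # save the rendered page as HTML
```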
- Open the `html_archiver.py` file, go to the line of code below, and make sure it points to your Chrome Driver (see prerequisites section above): `service = Service("C:\\selenium\\chromedriver.exe")`
- Install the required modules: `pip install requests`, `pip install selenium`, `pip install bs4`, `pip install html5lib`
- If using macOS, go to the line of code below and make sure you use a forward slash (`/`) instead of `\\`: `split_path = item_path.split("\\")`
- To generate the `history` folder, run the following command: `python html_archiver.py`
Script + CSS Linking - `fix_scripts` and `fix_css`

Modified versions of the existing core CSS and JS files live in the `history` folder, and will be added to each page if required. Files include:

- `jquery.js` - jQuery
- `menu.js` - Megamenu script
- `moment.js` - Moment.js
- `ssw_pigeon.js` - the most important script to include, the entire JS bundle for most of the v1 site
- `ssw_raven_print.css`
- `ssw_raven.css`
- All contents of the `css` directory (`Base.css`, `Content.css`, etc.) - only for archiving Standards

The script also removes all `iframe` and `script` tags to ensure we do not receive unnecessary noise + interaction (e.g. Chatbase, Google Analytics, etc.) when viewing these archived pages.
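As a rough illustration of what `fix_scripts`, `fix_css`, and the tag stripping do (hypothetical sketch; the file names match the list above, but the output paths and exact logic are assumptions - see html_archiver.py):

```python
# Hypothetical sketch of fix_scripts / fix_css style behaviour
from bs4 import BeautifulSoup

def fix_scripts_and_css(html: str) -> str:
    soup = BeautifulSoup(html, "html5lib")

    # Strip iframes and scripts so archived pages don't pull in Chatbase, Google Analytics, etc.
    for tag in soup.find_all(["iframe", "script"]):
        tag.decompose()

    # Re-link the modified core JS bundle and stylesheets stored in the history folder
    for js in ["jquery.js", "menu.js", "moment.js", "ssw_pigeon.js"]:
        soup.body.append(soup.new_tag("script", src=f"/archive/{js}"))
    for css in ["ssw_raven.css", "ssw_raven_print.css"]:
        soup.head.append(soup.new_tag("link", rel="stylesheet", href=f"/archive/{css}"))

    return str(soup)
```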
Image Downloading - `fix_images`

By default, the script will download all images from every page it saves from the v1 site and save them in the `history` folder. It saves them in the same relative path as the original image on the v1 site, e.g. ssw.com.au/ssw/images/Raven/SSWLogo.svg will save to `history/images/Raven/SSWLogo.svg`.
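Interpreting the example above, the URL-to-path mapping amounts to something like this (a sketch assuming the v1 /ssw root is dropped, which is what the example shows):

```python
from urllib.parse import urlparse

# e.g. "https://www.ssw.com.au/ssw/images/Raven/SSWLogo.svg" -> "history/images/Raven/SSWLogo.svg"
def local_image_path(image_url: str, out_dir: str = "history") -> str:
    path = urlparse(image_url).path      # "/ssw/images/Raven/SSWLogo.svg"
    if path.startswith("/ssw/"):
        path = path[len("/ssw"):]        # drop the v1 "/ssw" root so paths line up
    return out_dir + path                # "history/images/Raven/SSWLogo.svg"
```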
Image Replacement - `fix_images`

By updating the `IMAGE_REPLACEMENTS` dictionary in the `html_archiver.py` file, you can replace any image on the v1 site with another image. The key (e.g. `adam_thumb.jpg`) matches any image URL that ends with it (e.g. https://www.ssw.com.au/ssw/Standards/Images/adam_thumb.jpg); that URL is replaced with a non-broken image (e.g. https://www.ssw.com.au/ssw/Events/Training/Images/adam_thumb.jpg), and that image is saved instead of the broken original.
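For example, an entry like the following (illustrative values taken from the URLs above; check html_archiver.py for the actual dictionary shape and matching logic):

```python
# Illustrative IMAGE_REPLACEMENTS entry - any image URL ending in the key is swapped for the value
IMAGE_REPLACEMENTS = {
    "adam_thumb.jpg": "https://www.ssw.com.au/ssw/Events/Training/Images/adam_thumb.jpg",
}

def replace_image_url(src: str) -> str:
    for suffix, replacement in IMAGE_REPLACEMENTS.items():
        if src.endswith(suffix):
            return replacement   # download and save this image instead of the broken one
    return src
```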
Link Replacement - `fix_links`

This function replaces broken links, as well as links to pages that have already been archived (i.e. have `za` as the file prefix in the `SSW.Website.WebUI` folder).
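A rough sketch of that behaviour (hypothetical; the archive URL and prefix check are assumptions, and the broken-link handling is omitted - the actual rules live in html_archiver.py):

```python
# Hypothetical sketch of fix_links-style behaviour
import os
from bs4 import BeautifulSoup

WEBUI_DIR = "SSW.Website.WebUI"   # assumed local checkout of the v1 WebUI folder

def fix_links(html: str) -> str:
    soup = BeautifulSoup(html, "html5lib")
    for a in soup.find_all("a", href=True):
        page = a["href"].rstrip("/").split("/")[-1]
        # If the target page already has the "za" prefix in the WebUI folder, point at the archive
        if page and os.path.exists(os.path.join(WEBUI_DIR, "za" + page)):
            a["href"] = "/archive/" + page
    return str(soup)
```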
Banner Addition - `add_archive_header`

This function adds a banner to the top of the page to indicate that the page has been archived.

Figure: The banner added to each page
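Something along these lines (sketch only; the real banner markup, class name, and wording are defined in html_archiver.py):

```python
# Sketch of add_archive_header-style behaviour
from bs4 import BeautifulSoup

def add_archive_header(html: str) -> str:
    soup = BeautifulSoup(html, "html5lib")
    banner = soup.new_tag("div", attrs={"class": "archive-banner"})   # class name is an assumption
    banner.string = "This page has been archived - see the new site at ssw.com.au"
    soup.body.insert(0, banner)   # put the banner at the very top of the page
    return str(soup)
```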
Index Page Creation - `output_index_page`

This function generates a table of the pages that have been archived, with links to the old pages.

Figure: Index page example for /Training
To generate the `todos.md` file, run the following command:
python todo_outputter_md.py
This will find and list all the .aspx files in the project with their associated directories, and tick off pages that have zz or za at the start of their file names.
Additionally, to generate a `todos-notdone.md` file that lists all the pages that are not yet migrated, run the following command:
python todo_outputter_md.py --notdone-only
To generate the `todos.csv` file, run the following command:
python todo_outputter_csv.py
This will output a file called `todos.csv` that can be exported to Excel if required. The status of a page is determined by the prefix of the page name (more detail on prefixes below).
All migrated files in your branch will automatically be pushed to blob storage when your PR is merged into main. Alternatively, you can push the files to blob storage by triggering the workflow manually here.
- Turn off the v1 website's server
- Remove this repo once all pages have been migrated to v3
- zz - migrated to v3 or contains no content
- zr - redirects to another page
- za - migrated to archive
- Page - URL of the page
- Status - Either Done, Archived, Redirect or To-do
- Priority - value in powers of 2, e.g. 1, 2, 4, etc. - higher number = higher priority
- Complexity - value in powers of 2, e.g. 1, 2, 4, etc. - higher number = higher complexity
- Notes - additional info if required
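As a sketch, the prefix-to-status mapping described above amounts to something like this (illustrative only; see todo_outputter_csv.py for the real rules):

```python
# Illustrative mapping from file-name prefix to CSV status
def page_status(file_name: str) -> str:
    if file_name.startswith("zz"):
        return "Done"        # migrated to v3 or contains no content
    if file_name.startswith("zr"):
        return "Redirect"    # redirects to another page
    if file_name.startswith("za"):
        return "Archived"    # migrated to archive
    return "To-do"
```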
The video below explains how to archive pages using the script and how to submit changes to the site.