Cleanup ISLE git repository for Release v.1.4.0
As a result of the ISLE Phase II Sprint F - Optimize Docker Images in Fall 2019, the ISLE repository has had several changes:
-
The original ISLE git repo was ~182 MB and took 10-15 seconds or more to clone (depending on one's Internet connection)
-
The cleaned-up ISLE git repo is now 6.6 MB and should take less than 3 seconds to clone (depending on one's Internet connection)
-
A new cleanup script called isle-v140-git-cleanup-script.sh has been added, along with this documentation, for end users to run on their ISLE repository.
-
Any new pull request will have to be created from a forked or cloned ISLE repository that has had this cleanup process performed on it.
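To check these sizes on your own copy, a minimal sketch along the following lines may help. The function name `repo_git_size` is hypothetical and not part of ISLE; `git count-objects -vH` is a standard git command, and the `size-pack` line of its output is usually the figure of interest.

```shell
# Report how much disk space a repository's git data uses.
# Works for normal clones and for bare/mirror clones.
# Usage: repo_git_size /path/to/ISLE
repo_git_size() {
  local repo="${1:-.}"   # default to the current directory
  # -v prints detailed counts, -H prints sizes in human-readable units
  git -C "$repo" count-objects -vH
}
```

Running it before and after the cleanup gives a rough before/after comparison; exact numbers will vary with your fork's history.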
AS OF THE ISLE v.1.4.0 release in November 2019 (MANDATORY)
Due to the removal of older files and folders and git changes in the ISLE git repository as of November 25, 2019, please review the following:
- If you are a new ISLE user and are cloning or forking this project for the first time after November 25, 2019, then you can ignore this documentation and move on to the Quick Start section within the README.md as needed.
- If you have previously forked or cloned the ISLE project prior to November 25, 2019, then you do need to follow the instructions given in this documentation.
- If you are not sure or cannot tell whether you need to do this, then please refer to the instructions given in the Identification Tools section below within this documentation.
- If you intend to submit pull requests (PRs) of any kind (documentation, code, etc.) to the ISLE project, ISLE maintainers will NOT accept PRs from repos that haven't had the script run on them, as using an uncleaned forked repo will cause the removed files to return.
- Please run this script and process PRIOR to attempting to follow the normal ISLE update procedure for version 1.4.0.
A brief description of what the script does can be found below in the Script commands - explained section.
Git cleanup process
Git cleanup process - Overview
The steps below in the Git cleanup process section will have you perform the following:
- Open up a terminal and perform steps as directed
- Make a directory called ISLE_140_GIT_CLEANUP on your laptop or workstation to perform all of these steps and collect all materials
- Download the git cleanup script (isle-v140-git-cleanup-script.sh) and move it to the ISLE_140_GIT_CLEANUP directory
- git clone --mirror your forked or cloned ISLE project into the ISLE_140_GIT_CLEANUP directory
- Run the git cleanup script. It will prompt you to enter only the name of your forked or cloned ISLE project directory.
- git push the changes back to your remote forked or cloned ISLE project.
- Clone your newly changed and smaller forked or cloned ISLE project again, to a location of your choice, to confirm that the repository is smaller.
- Remove all previous, larger versions of your forked or cloned ISLE project directory. You can now delete the ISLE_140_GIT_CLEANUP directory and its contents.
- Work from the newly resized and smaller forked or cloned ISLE project directory.
- As needed, re-clone and redeploy projects on local, staging and production servers.
Git cleanup process - Steps
-
Open up a terminal
-
Change directory to a location of your choice
cd your_path_here
-
Make a new temporary directory called ISLE_140_GIT_CLEANUP
mkdir ISLE_140_GIT_CLEANUP
-
Change into the ISLE_140_GIT_CLEANUP directory, download the script into it, and fix its permissions.
-
cd ISLE_140_GIT_CLEANUP
-
wget https://raw.githubusercontent.com/Islandora-Collaboration-Group/ISLE/master/docs/assets/isle-v140-git-cleanup-script.sh
-
chmod 755 isle-v140-git-cleanup-script.sh
-
Run this git command to clone your forked or cloned ISLE project. This is not the usual git clone command; note the --mirror flag.
-
git clone --mirror <your git project URL>
-
Example:
-
git clone --mirror git@github.com:username/ISLE.git (as a fork using the same project name or title)
-
git clone --mirror git@github.com:username/yourprojectnamehere-isle.git (as a fork or cloned repo using a different project name or title)
-
This will create a new directory called ISLE.git or yourprojectnamehere-isle.git, which is not typical clone or fork output.
-
Run the cleanup script
./isle-v140-git-cleanup-script.sh
-
At the start of the script, there will be an interactive input prompt asking you to:
"Type in the name of your git repo only e.g. ISLE (do not include .git) and then press the [ENTER] or [RETURN] key to continue"
Enter the name here:
-
Enter the name of your forked or cloned ISLE git repository without the .git extension, e.g. ISLE or yourprojectnamehere-isle.
-
The script will then continue with minimal output. The process takes about 30-45 seconds, depending on the speed of your system.
-
Your final step is to push this local smaller repo back to the remote repository
-
cd ISLE.git or cd yourprojectnamehere-isle.git
-
git push
-
-
Clone your newly changed and smaller forked or cloned ISLE project again, to a location of your choice, to confirm that the repository is smaller.
-
Remove all previous versions of your cloned ISLE project directory.
-
You can now delete the ISLE_140_GIT_CLEANUP directory and its contents.
-
Work from a newly cloned ISLE or yourprojectnamehere-isle project directory and continue with the ISLE update process to 1.4.0 (if/as needed)
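Because the script's prompt expects the repository name without the .git extension, it can help to derive that name from the same URL you passed to git clone --mirror. The following is a small sketch using only shell parameter expansion; the function name `repo_name_from_url` is hypothetical and not part of the ISLE script.

```shell
# Derive the repository name (without ".git") from a clone URL or path.
# Usage: repo_name_from_url git@github.com:username/ISLE.git
repo_name_from_url() {
  local url="${1%/}"        # drop one trailing slash, if present
  local name="${url##*/}"   # keep everything after the last "/"
  name="${name##*:}"        # handle scp-style URLs with no slash, e.g. host:repo.git
  printf '%s\n' "${name%.git}"   # strip the ".git" suffix, if present
}
```

The printed value is what you would type at the "Enter the name here:" prompt, and appending .git to it gives the name of the mirror directory to cd into before running git push.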
Git cleanup process - Example redeploy process
-
To re-deploy on your local
- Assumption: You'll have already run the git cleanup steps and script on your forked ISLE git project.
- cd into the parent directory that contains your yourprojectnamehere-isle directory, e.g. ~/Code/, Documents, Projects, or Sites.
- This will be the directory you chose to locate your yourprojectnamehere-isle directory.
- mkdir isle-archive
- mv yourprojectnamehere-isle isle-archive/
- git clone <URL of your yourprojectnamehere-isle project>
- If you had any untracked files, you'll need to copy them back into place from isle-archive/yourprojectnamehere-isle to the new yourprojectnamehere-isle
- docker-compose pull (just in case you didn't have any of the new images)
- If upgrading to ISLE 1.4.0, then follow the usual ISLE update steps
-
To re-deploy on your staging & production server(s)
- Assumptions: You have already run the git cleanup steps and script on your forked ISLE git project, re-cloned your local, and followed the usual ISLE update steps for version 1.4.0.
- Start with Staging first
- ssh into your Staging ISLE host server
- cd /opt/
- sudo mkdir isle-archive
- cd /opt/yourprojectnamehere-isle
- docker-compose down
- sudo chown -Rv islandora:islandora /opt/isle-archive
- sudo mv /opt/yourprojectnamehere-isle /opt/isle-archive/
- git clone <URL of your yourprojectnamehere-isle project> into /opt/
- If you had any untracked files like acme.json, you'll need to copy them back into place from /opt/isle-archive/yourprojectnamehere-isle
- For example: cp /opt/isle-archive/yourprojectnamehere-isle/config/proxy/acme.json /opt/yourprojectnamehere-isle/config/proxy/acme.json
- cd /opt/yourprojectnamehere-isle (the new clone)
- Edit your .env for the correct environment and commit the change but do not push back.
- docker-compose pull (just in case you didn't have any of the new images)
- docker-compose up -d to start up the containers
- Repeat the steps above on the Production server
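When deciding which untracked files (such as acme.json) need to be copied back, one option is to ask git to list them in the archived copy. This is a sketch, not part of the ISLE tooling; the function name `list_untracked` is hypothetical. It relies on `git ls-files --others`, which lists files git is not tracking; because no exclude options are passed, files ignored via .gitignore are listed as well.

```shell
# List files in a working copy that git does not track
# (ignored files are included, since no --exclude options are given).
# Usage: list_untracked /opt/isle-archive/yourprojectnamehere-isle
list_untracked() {
  local repo="${1:-.}"   # default to the current directory
  git -C "$repo" ls-files --others
}
```

The output can be long if the directory holds generated data, so treat it as a checklist to review by hand rather than as a list to copy wholesale.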
Identification tools
- Please note: In some of the examples given below, the git repository is named ISLE; your forked or cloned repo may use a different project name or title, so please replace ISLE with the name of your forked or cloned ISLE project git repository, e.g. yourprojectnamehere-isle.
If you would like to check if the files have been removed or are still present:
- Clone the following tool repo, which pinpoints offending files and folders, into the parent directory of your local forked or cloned ISLE git repository. Do not clone this tool into your local forked or cloned ISLE git repository itself.
git clone https://github.com/ivantikal/git-tools.git
- Your parent directory should now look like the example below (again, please note that your local forked or cloned ISLE git repository might not use ISLE as its name):
directory/
├── git-tools
└── ISLE
-
Run the tools against your existing local forked or cloned ISLE git repository.
- Example usage for finding the 50 biggest files or folders in your local forked or cloned ISLE git repository. You can swap out ISLE for the name of your forked or cloned ISLE project git repository, e.g. yourprojectnamehere-isle
./git-tools/clean-binaries/get_biggest_files_in_history.sh -r ./ISLE/ -n 50
-
Review the resulting file to see if any of the files and folders listed in the Files & Folder removed section appear.
cat ./git-tools/clean-binaries/get_biggest_files_in_history.sh.tmp/bigtosmall.txt
-
If none of the files appear, you don't need to run the script.
-
If any of the files or folders appear, you have to run the script. Scroll back up to the top of this document and follow the steps as provided.
-
More information on how to use this tool can be found in the git-tools repository cloned above.
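If you would rather not clone an extra tool, a similar listing can be approximated with git's own plumbing commands. The sketch below is an alternative to, not a copy of, the git-tools script; the function name `biggest_blobs` is hypothetical. It walks every object reachable from any ref, keeps only blobs (file contents), and sorts them by uncompressed size.

```shell
# Print the N largest blobs in a repository's history as
# "blob <size-in-bytes> <path>", largest first.
# Usage: biggest_blobs ./ISLE 50
biggest_blobs() {
  local repo="${1:-.}" count="${2:-50}"
  git -C "$repo" rev-list --objects --all |
    git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob"' |   # drop commits and trees
    sort -k2,2nr |         # numeric sort on the size column, descending
    head -n "$count"
}
```

A path appears once per distinct version of the file, so the same name may be listed several times.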
Files & Folder removed
- Over 176 MB of errant blobs, binaries and code were identified and removed.
Files removed:
dg.localdomain.key
dg.localdomain.pem
docker-compose.DG-FEDORA.yml
fcrepo-drupalauthfilter-3.8.1.jar
fcrepo-installer-3.8.1.jar
fedoragsearch-2.8.1.zip
log4j-over-slf4j-1.6.6.jar
search_index.json
sitemap.xml
sitemap.xml.gz
slf4j-jdk14-1.6.6.jar
Folders (and files within) removed:
drupal
solr-4.10.4
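As a quick cross-check against the names listed above, the following sketch searches a repository's full history for them. It is an illustration only: the function name `removed_paths_in_history` is hypothetical, and matching is by file or folder name alone, so a hit is a hint to investigate rather than proof that the cleanup is needed.

```shell
# Print every object path in history whose file or folder name matches
# one of the names removed for ISLE v.1.4.0. Exits non-zero if none match.
# Usage: removed_paths_in_history ./ISLE
removed_paths_in_history() {
  local repo="${1:-.}"
  local files='dg\.localdomain\.key|dg\.localdomain\.pem|docker-compose\.DG-FEDORA\.yml'
  files="$files|fcrepo-drupalauthfilter-3\.8\.1\.jar|fcrepo-installer-3\.8\.1\.jar"
  files="$files|fedoragsearch-2\.8\.1\.zip|log4j-over-slf4j-1\.6\.6\.jar"
  files="$files|search_index\.json|sitemap\.xml|sitemap\.xml\.gz|slf4j-jdk14-1\.6\.6\.jar"
  local folders='drupal|solr-4\.10\.4'
  # rev-list prints "<object-id> <path>", so a name is preceded by a space or "/"
  git -C "$repo" rev-list --objects --all |
    grep -E "[ /]($files)\$|[ /]($folders)(/|\$)"
}
```

No output suggests that none of the listed names remain in that repository's history.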
Script commands - explained
You are welcome to open up isle-v140-git-cleanup-script.sh in a text editor to review its contents prior to running.
Essentially, isle-v140-git-cleanup-script.sh will perform the following:
-
Download the BFG-repo-cleaner jar file
bfg-1.113.0.jar
-
Run the appropriate BFG java commands to delete files and folders
-
Run these git commands on your forked or cloned ISLE git repo
git reflog expire --expire=now --all && git gc --prune=now --aggressive
-
Please note: This script will not push back to your source repo. You need to do that manually.
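To make the sequence above more concrete, here is a rough dry-run sketch of that kind of workflow. It is not the contents of isle-v140-git-cleanup-script.sh, which may differ in its details. The names `run`, `cleanup_sketch`, `DRY_RUN` and `BFG_JAR` are hypothetical, the jar path is an assumption, and only two of the removed names are shown. The `--delete-files` and `--delete-folders` options are documented BFG Repo-Cleaner options. By default the sketch only prints the commands it would run.

```shell
# Dry-run by default: print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: cleanup_sketch ISLE.git   (the directory made by git clone --mirror)
cleanup_sketch() {
  local mirror="$1"
  local bfg_jar="${BFG_JAR:-bfg.jar}"   # assumed location of the BFG jar
  # Remove a file and a folder by name from all of history (examples only)
  run java -jar "$bfg_jar" --delete-files fcrepo-installer-3.8.1.jar "$mirror"
  run java -jar "$bfg_jar" --delete-folders solr-4.10.4 "$mirror"
  # Expire old references and garbage-collect the now-unreferenced objects
  run git -C "$mirror" reflog expire --expire=now --all
  run git -C "$mirror" gc --prune=now --aggressive
}
```

Calling `cleanup_sketch ISLE.git` prints four commands; nothing is modified unless DRY_RUN is set to a value other than 1.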