I started by trying to create an import script that was build into a normal page action, you would go to a page ie:
The problem with this is that downloading and importing a feed takes a long time and so the page will appear unresponsive for that time, the loading animation just turns and turns.
If you mistakenly click on the refresh button, the initial process will not close and so you will start to get duplicate processes which will mess up the data and increase the server load.
Ideally while downloading and importing the data there would be a progress bar so that you can see what's happening and any other useful feedback if things go wrong.
To get this kind of feedback you need to be able to update the browser as the process is running.
You also want the syncing to happen directly between servers rather than server to local computer to server as this would be faster in instances where both servers are remote and the internet access to the local computer is relatively slow, - i.e you want the import process handled in php which is server side (you could use another server side language if you wanted).
The way to get feedback about what's happening while having the import process being handled in php is to have two separate scripts one that handles the importing and another that can be refreshed regularly serves the gui webpage.
We can extend the idea of a separate import script to sort the problem of duplicate processes too by implementing a download and import queue.
When a user clicks to download/sync a feed in the sync page, the action behind this enters the details of the feed to be downloaded in the queue, it can also check at this point that it does not already exist in the queue.
We can then run the import script independently of the gui page, such as via a scheduled cron job that runs the script one a minute. We can put a check in there to ensure that only one instance of the script is running at any given time too.
The import script then checks if there are feeds to be downloaded in the import queue, working through each queue item sequentially.
The added advantage of the queue implementation is that we can keep the load that importing places on the server low allowing only one feed to be downloaded and imported at any given time.
So all in all this implementation:
- Enables feedback on import progress.
- Ensures we don’t get duplicate feed syncing processes.
- Ensures that the server load is kept under control.
On to the code, here's the import script that works through the import queue and downloads and imports each item sequentially, I would be interested to hear if there are any ways the efficiency of the export and import process can be improved.
In the next post I will explain the interface for generating the feed import queue. To engage in discussion regarding this post, please post on our Community Forum.