With this utility you can update the pages of a cache archive. Starting from one or more start pages, the links to pages that are already archived are followed. Changed pages are stored in an update archive.
Start of Utility
You start the update with one of the following script files in the folder: MM3-WebAssistantProfessional/script/
Script | Operating System |
---|---|
MM3-Utility.bat | Microsoft Windows |
MM3-Utility.sh | Linux and UNIX |
MM3-Utility.command | Apple Mac OS X |
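The choice of start script can be sketched as follows. This is only an illustration: the script names and the folder come from the table above, while the function name `utility_script` and the fallback to the `.sh` script for other UNIX-like systems are assumptions.

```python
import platform
from pathlib import Path

# Folder and script names as given in the text and table above.
# Assumption: the path is relative to the installation location.
SCRIPT_DIR = Path("MM3-WebAssistantProfessional") / "script"
SCRIPTS = {
    "Windows": "MM3-Utility.bat",
    "Linux": "MM3-Utility.sh",
    "Darwin": "MM3-Utility.command",  # platform.system() reports Mac OS X as "Darwin"
}


def utility_script(system=None):
    """Return the path of the start script for the given (or current) system."""
    system = system or platform.system()
    # Assumption: any other UNIX-like system uses the .sh script.
    return SCRIPT_DIR / SCRIPTS.get(system, "MM3-Utility.sh")


print(utility_script())
```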
In the first dialog all utilities are displayed.
Select: Update from archived pages.
Click Next to open the configuration dialog: Update
You can save an update configuration in a set. For the first run, use the set default. With New you create a new set. You can also Rename or Delete a set using the pop-up menu.
General
Start Page
Starting from one or more start pages, the links to pages that are already archived are followed. Changed pages are stored in an update archive. Enter the URL of each start page on a separate line.
Note
Link following is restricted for dynamic elements such as JavaScript and Flash.
Surf Set and Cache Archive
The parameters of the surf set are used, with the exception of Marker and Prefetch. Only the cache archive that has write access in the selected surf set is updated. The pages are updated by following the links in the cache archive. If the surf set consists of only one archive, the links marked green are followed (see Surf Set/Marker).
Update Archive
The updated pages and their new resources are archived in this archive. The name of the update archive is the name of the cache archive to be updated plus a name addition. By default, today's date is used as the addition; this is specified by a character pattern. The character pattern starts with an opening bracket [ and ends with a closing bracket ].
Character | Description | Example |
---|---|---|
yy | two-digit number for the year | 06 |
MM | two-digit number for the month | 03 |
dd | two-digit number for the day | 21 |
HH | two-digit number for the hour | 14 |
mm | two-digit number for the minutes | 36 |
ww | two-digit number for the calendar week | 40 |
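How such a character pattern could expand into a name addition is sketched below. This is not the tool's implementation: the function name `expand_name_addition`, the removal of the brackets in the result, and ISO 8601 week numbering for `ww` are assumptions made for illustration.

```python
import re
from datetime import datetime


def _token_value(token, moment):
    """Translate one two-character pattern token into its two-digit value."""
    if token == "yy":
        return f"{moment.year % 100:02d}"
    if token == "MM":
        return f"{moment.month:02d}"
    if token == "dd":
        return f"{moment.day:02d}"
    if token == "HH":
        return f"{moment.hour:02d}"
    if token == "mm":
        return f"{moment.minute:02d}"
    # Remaining token is "ww".
    # Assumption: ISO 8601 week numbering; the tool may count weeks differently.
    return f"{moment.isocalendar()[1]:02d}"


def expand_name_addition(name, moment):
    """Replace every [pattern] in name with the values for the given moment."""
    def expand_bracket(match):
        return re.sub(
            r"yy|MM|dd|HH|mm|ww",
            lambda token: _token_value(token.group(0), moment),
            match.group(1),
        )

    # Assumption: the brackets themselves do not appear in the archive name.
    return re.sub(r"\[([^\]]*)\]", expand_bracket, name)


moment = datetime(2006, 3, 21, 14, 36)
print(expand_name_addition("news-[yyMMdd]", moment))
print(expand_name_addition("news-[yy-MM-dd_HHmm]", moment))
```

Note that the pattern is case-sensitive: `MM` stands for the month and `mm` for the minutes.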
You can check the number of pages to be updated before storing them. To do so, deactivate the update archive. Based on the result, you can determine whether additional start pages are needed or specify restrictions for link following.
Note
Pages are archived only if the update archive is activated.
Reuse
The utility MM3-Update updates all files that are older than the configured reuse time.
You can set the reuse time as a relative or an absolute time.
Examples are given in the following two tables.
Unit | Description | Example |
---|---|---|
m | Minute | 62m |
h | Hour | 10h |
d | Day | 3d |
w | Week | 2w |
M | Month | 6M |
y | Year | 1y |
Format | Description | Example |
---|---|---|
yyyy.MM.dd | Date Year.Month.Day | 2008.08.19 |
hh:mm | Time Hour:Minute | 10:30 |
yyyy.MM.dd hh:mm | Date and Time | 2008.08.19 10:30 |
Note
If you do not specify the date completely, today's date and the current time of day are used for the missing parts.
Format | Description |
---|---|
..1 00:00 | first day of the current month |
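The reuse rules above can be sketched as follows. The function names are hypothetical, and two details are assumptions rather than documented behaviour: a month is counted as 30 days and a year as 365 days, and any part left out of an absolute time is taken from the current date and time.

```python
from datetime import datetime, timedelta

# Minutes per unit from the table of relative times.
# Assumption: a month is counted as 30 days and a year as 365 days.
UNIT_MINUTES = {
    "m": 1,
    "h": 60,
    "d": 24 * 60,
    "w": 7 * 24 * 60,
    "M": 30 * 24 * 60,
    "y": 365 * 24 * 60,
}


def relative_cutoff(value, now):
    """Turn a relative reuse time such as '10h' into a point in time."""
    amount, unit = int(value[:-1]), value[-1]
    return now - timedelta(minutes=amount * UNIT_MINUTES[unit])


def absolute_cutoff(value, now):
    """Parse 'yyyy.MM.dd hh:mm'; parts that are left out are taken from now."""
    year, month, day = now.year, now.month, now.day
    hour, minute = now.hour, now.minute
    for part in value.split():
        if ":" in part:
            hour_text, minute_text = part.split(":")
            hour, minute = int(hour_text), int(minute_text)
        else:
            year_text, month_text, day_text = part.split(".")
            year = int(year_text) if year_text else year
            month = int(month_text) if month_text else month
            day = int(day_text) if day_text else day
    return datetime(year, month, day, hour, minute)


def needs_update(stored_at, cutoff):
    """A file is updated if it is older than the reuse time."""
    return stored_at < cutoff


now = datetime(2008, 8, 19, 10, 30)
print(relative_cutoff("10h", now))
print(absolute_cutoff("..1 00:00", now))
```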
Waiting time
Some servers automatically abort file downloads because the files are requested in much faster succession than during normal surfing. You can therefore insert a waiting time; it is best to let it vary between a minimum and a maximum waiting time.
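A varying waiting time of this kind could look like the following minimal sketch. The function name `wait_between_requests` and the use of seconds as the unit are assumptions; the text does not state the unit.

```python
import random
import time


def wait_between_requests(min_seconds, max_seconds, sleep=time.sleep):
    """Pause for a random time between the minimum and maximum waiting time."""
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


# Example: wait between 0.01 and 0.05 seconds before the next request.
print(wait_between_requests(0.01, 0.05))
```

Varying the delay avoids a fixed request rhythm, which is closer to how requests are spaced during manual surfing.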
Log
By default, all followed links are logged.
Log with | In an HTML page … |
---|---|
Contained resources | all contained resources |
Not followed links | all links not followed |
In the summary you can additionally log the links that were not followed.
Note
Pages and resources that occur repeatedly are logged only once.
Following Links
If too many pages are downloaded during mirroring, you can prevent this with a filter. A filter can define: domain, directory, file name, and file or MIME types. The additional parameter Follow controls how the filter affects link following.
Follow | Following or excluding link |
---|---|
Yes | Follow: If the link includes the character pattern of the filter. |
No | Exclude: If the link includes the character pattern of the filter. |
--- | Disable: The filter is not used. |
If both a Follow filter and an Exclude filter match a link, the link is excluded and the page is not downloaded.
The detailed structure of the filter is described in: Filter for Following Links
Follow | Domain | Path | File | Type |
---|---|---|---|---|
Yes | Proxy-Offline-Browser.com | / | | |
No | | / | | pdf application/pdf |
No | | /private/ | | |
Pages are downloaded only from the domain Proxy-Offline-Browser.com, but no PDF documents and no pages from the folder /private/ and its subdirectories.
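The rule that an exclusion takes precedence over a follow can be sketched as below, using the example just described. The class `LinkFilter`, the function `should_follow`, and the matching details (substring match on domain and path, file type taken from the URL ending, empty pattern matches everything) are assumptions for illustration. MIME types are left out because they are only known after a request.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class LinkFilter:
    follow: bool         # True = "Yes" (follow), False = "No" (exclude)
    domain: str = ""     # assumption: an empty pattern matches everything
    path: str = ""
    file_type: str = ""  # file extension such as "pdf"

    def matches(self, url):
        """Check whether the link includes the character patterns of the filter."""
        parts = urlparse(url)
        path = parts.path or "/"
        if self.domain.lower() not in parts.netloc.lower():
            return False
        if self.path not in path:
            return False
        if self.file_type and not path.lower().endswith("." + self.file_type.lower()):
            return False
        return True


def should_follow(url, filters, default=False):
    """Exclude wins over follow; default applies if no filter matches."""
    matching = [f for f in filters if f.matches(url)]
    if any(not f.follow for f in matching):
        return False
    if any(f.follow for f in matching):
        return True
    return default


filters = [
    LinkFilter(True, domain="Proxy-Offline-Browser.com", path="/"),
    LinkFilter(False, path="/", file_type="pdf"),
    LinkFilter(False, path="/private/"),
]
print(should_follow("http://www.proxy-offline-browser.com/index.html", filters))
```

The `default` argument stands in for the general setting of whether unmatched links are followed or excluded.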
Note
Please select whether links are generally to be followed, excluded, or enquired.
With Enquire, you can decide interactively during the update whether a link is to be followed or excluded.
With Generate filters you can generate the required filters from your start pages (URLs). An additional pattern can refine the filters, or you can post-process the filters manually or interactively.
Log
All locally available pages that are excluded by the filter are logged. This lets you check the filter settings.
Cookie
Cookies can be accepted or blocked depending on the filter settings. This supports data privacy. Note, however, that blocking cookies can also prevent the history (breadcrumb trail) from being displayed on an HTML page.
Accept | Cookie accept or block |
---|---|
Yes | Accept: If the URL includes the character pattern of the filter. |
No | Block: If the URL includes the character pattern of the filter. |
--- | Disable: The filter is not used. |
If both an Accept filter and a Block filter match a cookie, the cookie is blocked.
The detailed structure of the filter is described in: Filter for Cookie
Accept | Domain | Path |
---|---|---|
No | | / |
All cookies are blocked.
Log
All accepted cookies are logged. This lets you check the filter settings.
Authentication
For a password-protected web site, the user name and the password can be saved.
Login | Access data for a web site |
---|---|
Yes | Login: Log in for this URL with the access data. |
No | Block: No longer log in for this URL. |
--- | Disable: The filter is not used. |
In addition, you can specify whether the password is to be saved when a protected page is called.
Passwords can be protected with a master password. The master password has to be entered each time the utility Update is started.
Note
Logins can be processed only if the login is requested via the HTTP protocol header (authentication with the browser's dialog box). Logins via an HTML form are not supported.
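The difference between the two login kinds can be illustrated with a short sketch. A login requested via the HTTP header shows up as a 401 response carrying a WWW-Authenticate header, whereas a form login is an ordinary HTML page. The function names are hypothetical, and the Basic scheme is used only as an example; the text does not say which authentication schemes the utility handles.

```python
import base64


def requests_http_login(status, headers):
    """True if the server asks for credentials via the HTTP header mechanism."""
    names = {name.lower() for name in headers}
    return status == 401 and "www-authenticate" in names


def basic_auth_header(user, password):
    """Build the request header for the Basic scheme (one possible scheme)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# A protected page answers with status 401 and a WWW-Authenticate header.
print(requests_http_login(401, {"WWW-Authenticate": 'Basic realm="archive"'}))
# A form login is delivered as a normal page, so it is not detected here.
print(requests_http_login(200, {"Content-Type": "text/html"}))
```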
Start of the update
After you have filled in the required configuration or selected a set, start the update with Start Update. The update is logged in a dialog. A summary is displayed when the update finishes or is cancelled. A report can be displayed in the browser.
Display of the update archive reports
You reach the reports via the local sitemap (or directly via the URL: http://127.0.0.1:8080/update/). Select the desired report from a list. A report includes all changed pages of an update. Passages newly added to a page are displayed.
You can configure the display as follows:
- Number of changed pages per report page.
- Number of characters to be displayed for a changed passage.
- Sorting sequence of the changed pages.
You can surf in a report. While surfing, the changes are shown to you (highlighted in blue by default). During online surfing, all pages are stored in the update archive (not in the cache archive).
You recognize this mode by the update symbol, which is shown instead of the surf set symbol. Clicking the update symbol ends the update mode. You enter the update mode by calling a report. The called report determines the update archive and the surf set. If several reports are displayed at the same time, the report loaded last determines the update archive and the surf set. The surf set in use is displayed in the main menu, and the tooltip indicates the update archive.
Taking
Clicking the synchronization symbol copies all resources into the corresponding cache archive. The copied domains are logged. Afterwards you can delete the report and the update archive.