enhance metha-sync with delay arg. #25

Merged
merged 1 commit into from
Oct 3, 2021
2 changes: 2 additions & 0 deletions cmd/metha-sync/main.go
@@ -20,6 +20,7 @@ var (
baseDir = flag.String("base-dir", metha.GetBaseDir(), "base dir for harvested files")
hourly = flag.Bool("hourly", false, "use hourly intervals for harvesting")
daily = flag.Bool("daily", false, "use daily intervals for harvesting")
delay = flag.Int("delay", 0, "sleep between each OAI-PMH request")
disableSelectiveHarvesting = flag.Bool("no-intervals", false, "harvest in one go, for funny endpoints")
endpointList = flag.Bool("list", false, "list a selection of OAI endpoints (might be outdated)")
format = flag.String("format", "oai_dc", "metadata format")
@@ -118,6 +119,7 @@ func main() {
harvest.HourlyInterval = *hourly
harvest.DailyInterval = *daily
harvest.ExtraHeaders = extra
harvest.Delay = *delay
log.Printf("harvest: %+v", harvest)
if *removeCached {
log.Printf("removing already cached files from %s", harvest.Dir())
6 changes: 6 additions & 0 deletions harvest.go
@@ -67,6 +67,8 @@ type Harvest struct {
DailyInterval bool
ExtraHeaders http.Header

Delay int

// XXX: Lazy via sync.Once?
Identify *Identify
Started time.Time
@@ -335,6 +337,10 @@ func (h *Harvest) runInterval(iv Interval) error {
req.From = iv.Begin.Format(h.DateLayout())
req.Until = iv.End.Format(h.DateLayout())
}

if h.Delay > 0 {
time.Sleep(time.Duration(h.Delay) * time.Second)
}
// Do request, return any http error, except when we ignore HTTPErrors - in that case, break out early.
resp, err := Do(&req)
if err != nil {