Initial commit
agis committed Jun 26, 2020
0 parents commit 9e386d9
Showing 14 changed files with 900 additions and 0 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,4 @@
# Changelog

## master (unreleased)

20 changes: 20 additions & 0 deletions LICENSE
@@ -0,0 +1,20 @@
The MIT License

Copyright (c) 2020 Skroutz S.A.

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
103 changes: 103 additions & 0 deletions README.md
@@ -0,0 +1,103 @@
# RSpecQ

RSpecQ (`rspecq`) distributes and executes an RSpec suite over many workers,
using a centralized queue backed by Redis.

RSpecQ is heavily inspired by [test-queue](https://github.com/tmm1/test-queue)
and [ci-queue](https://github.com/Shopify/ci-queue).

## Why don't you just use ci-queue?

While evaluating ci-queue for our RSpec suite, we observed slow boot times
in the workers (up to 3 minutes), increased memory consumption and too much
disk I/O on boot. This is due to the fact that a worker in ci-queue has to
load every spec file on boot. This can be problematic for applications with
a large number of spec files.

RSpecQ works with spec files as its unit of work (as opposed to ci-queue which
works with individual examples). This means that an RSpecQ worker does not
have to load all spec files at once and so it doesn't have the aforementioned
problems. It also allows suites to keep using `before(:all)` hooks
(which ci-queue explicitly rejects). (Note: RSpecQ also schedules individual
examples, but only when this is deemed necessary, see section
"Spec file splitting").

We also observed faster build times by scheduling spec files instead of
individual examples, due to far fewer Redis operations.

The downside of this design is that it's more complicated, since the scheduling
of spec files happens based on timings calculated from previous runs. This
means that RSpecQ maintains a key with the timing of each job and updates it
on every run (if the `--timings` option was used). Also, RSpecQ has a "slow
file threshold" which currently has to be set manually (but this can be
improved).

*Update*: ci-queue deprecated support for RSpec, so there's that.

## Usage

Each worker needs to know the build it will participate in, its name and where
Redis is located. To start a worker:

```shell
$ rspecq --build-id=foo --worker-id=worker1 --redis=redis://localhost
```

To view the progress of the build, use `--report`:

```shell
$ rspecq --build-id=foo --worker-id=reporter --redis=redis://localhost --report
```

For detailed info use `--help`.


## How it works

The basic idea is identical to ci-queue, so please refer to its README.

### Terminology

- Job: the smallest unit of work, which is usually a spec file
  (e.g. `./spec/models/foo_spec.rb`) but can also be an individual example
  (e.g. `./spec/models/foo_spec.rb[1:2:1]`) if the file is too slow.
- Queue: a collection of Redis-backed structures that hold all the necessary
  information for RSpecQ to function. This includes timing statistics, jobs to
  be executed, the failure reports, requeueing statistics and more.
- Worker: a process that, given a build id, pops jobs of that build and
  executes them using RSpec.
- Reporter: a process that, given a build id, waits for the build to finish
  and prints the summary report (examples executed, build result, failures etc.).
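
To make the two kinds of job concrete, here is a minimal sketch that tells a
spec file job apart from an individual example job by looking at the trailing
`[1:2:1]` part of the id. The names `EXAMPLE_ID_SUFFIX`, `example_job?` and
`file_of` are hypothetical and are not part of RSpecQ.

```ruby
# Hypothetical helpers, not taken from RSpecQ's source.
# Matches the trailing "[1:2:1]" part of an RSpec example id.
EXAMPLE_ID_SUFFIX = /\[\d+(?::\d+)*\]\z/

# True if the job is an individual example rather than a whole spec file.
def example_job?(job)
  job.match?(EXAMPLE_ID_SUFFIX)
end

# Returns the spec file a job belongs to, for both kinds of job.
def file_of(job)
  job.sub(EXAMPLE_ID_SUFFIX, "")
end

["./spec/models/foo_spec.rb", "./spec/models/foo_spec.rb[1:2:1]"].each do |job|
  kind = example_job?(job) ? "example" : "file"
  puts "#{job} (#{kind}, file: #{file_of(job)})"
end
```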

### Spec file splitting

Very slow files may put a limit to how fast the suite can execute. For example,
a worker may spend 10 minutes running a single slow file, while all the other
workers finish after 8 minutes. To overcome this issue, rspecq splits
files whose execution time is above a certain threshold
(set with the `--file-split-threshold` option) and instead schedules them as
individual examples.
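
The splitting decision can be sketched roughly as follows. This is only an
illustration under stated assumptions: the `build_jobs` helper, the in-memory
`timings` and `examples_by_file` hashes, the 30 second threshold and the
slowest-first ordering are all hypothetical, and RSpecQ's actual scheduling
code may differ.

```ruby
# Hypothetical sketch; none of these names come from RSpecQ's source.
FILE_SPLIT_THRESHOLD = 30.0 # seconds, stands in for --file-split-threshold

# Timings (in seconds) as they might have been recorded by a previous run.
timings = {
  "./spec/models/foo_spec.rb" => 95.2,
  "./spec/models/bar_spec.rb" => 4.1,
  "./spec/lib/baz_spec.rb"    => 12.7,
}

# Example ids of the slow file; in practice these would come from RSpec.
examples_by_file = {
  "./spec/models/foo_spec.rb" => [
    "./spec/models/foo_spec.rb[1:1]",
    "./spec/models/foo_spec.rb[1:2:1]",
  ],
}

# Returns the list of jobs to enqueue: files above the threshold are replaced
# by their individual examples, all other files are scheduled as a whole.
# Ordering the slowest files first is an assumption of this sketch.
def build_jobs(timings, examples_by_file, threshold)
  timings.sort_by { |_file, secs| -secs }.flat_map do |file, secs|
    if secs > threshold && examples_by_file.key?(file)
      examples_by_file[file]
    else
      [file]
    end
  end
end

jobs = build_jobs(timings, examples_by_file, FILE_SPLIT_THRESHOLD)
puts jobs
```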

In the future, we'd like for the slow threshold to be calculated and set
dynamically.

### Requeues

As a mitigation measure for flaky tests, if an example fails it will be put
back to the queue to be picked up by another worker. This will be repeated up
to a certain number of times (see `MAX_REQUEUES`), after which the example will
be considered a legit failure and will be printed in the final report
(`--report`).
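
The requeue logic can be illustrated with a small in-memory stand-in for the
queue. The method names `requeue_job` and `record_example_failure` match the
calls made by `FailureRecorder`, but their bodies below, as well as the
`InMemoryQueue` class itself, are assumptions made for this sketch. The real
queue is backed by Redis.

```ruby
MAX_REQUEUES = 3 # mirrors RSpecQ::MAX_REQUEUES from lib/rspecq.rb

# Hypothetical in-memory queue; RSpecQ's real queue is backed by Redis.
class InMemoryQueue
  attr_reader :pending, :failures

  def initialize(jobs)
    @pending = jobs.dup
    @requeues = Hash.new(0) # job => number of times it was requeued
    @failures = {}          # job => failure message
  end

  # Puts the job back in the queue and returns true, unless it has already
  # been requeued max_requeues times, in which case it returns false.
  def requeue_job(job, max_requeues)
    return false if @requeues[job] >= max_requeues

    @requeues[job] += 1
    @pending.push(job)
    true
  end

  def record_example_failure(job, message)
    @failures[job] = message
  end
end

queue = InMemoryQueue.new(["./spec/models/foo_spec.rb[1:2:1]"])
attempts = 0

# Simulate an example that fails on every attempt.
while (job = queue.pending.shift)
  attempts += 1
  next if queue.requeue_job(job, MAX_REQUEUES)

  queue.record_example_failure(job, "expected true, got false")
end

puts "attempts: #{attempts}, failures: #{queue.failures.keys}"
```

In this sketch an always-failing example runs once plus `MAX_REQUEUES` more
times before it is recorded as a failure.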

### Worker failures

Workers emit a timestamp after each example, as a heartbeat, to denote
that they're fine and performing jobs. If a worker hasn't reported for
a given amount of time (see `WORKER_LIVENESS_SEC`) it is considered dead
and the job it reserved will be requeued, so that it is picked up by another worker.

This protects us against unrecoverable worker failures (e.g. segfault).
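
A rough sketch of the liveness check follows. The `dead_workers` helper and
the in-memory `workers` hash are hypothetical; in RSpecQ the heartbeats and
reserved jobs live in Redis, and the actual detection code may look
different.

```ruby
WORKER_LIVENESS_SEC = 60.0 # mirrors RSpecQ::WORKER_LIVENESS_SEC

# Hypothetical helper: returns the workers whose last heartbeat is older than
# liveness_sec. `workers` maps a worker id to its last heartbeat timestamp
# and the job it has currently reserved.
def dead_workers(workers, now, liveness_sec)
  workers.select { |_id, w| now - w[:heartbeat] > liveness_sec }
end

now = 1_000.0 # a fixed "current time" keeps the sketch deterministic
workers = {
  "worker1" => { heartbeat: 990.0, job: "./spec/models/foo_spec.rb" },
  "worker2" => { heartbeat: 900.0, job: "./spec/models/bar_spec.rb" },
}
pending = []

# Requeue the job of every dead worker so that another worker can pick it up.
dead_workers(workers, now, WORKER_LIVENESS_SEC).each do |id, w|
  pending << w[:job]
  workers.delete(id)
end

puts "requeued: #{pending.inspect}, alive: #{workers.keys.inspect}"
```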

## License

RSpecQ is licensed under MIT. See [LICENSE](LICENSE).
67 changes: 67 additions & 0 deletions bin/rspecq
@@ -0,0 +1,67 @@
#!/usr/bin/env ruby
require "optionparser"
require "rspecq"

opts = {}
OptionParser.new do |o|
  o.banner = "Usage: #{$PROGRAM_NAME} [opts] [files_or_directories_to_run]"

  o.on("--build-id ID", "A unique identifier denoting the build") do |v|
    opts[:build_id] = v
  end

  o.on("--worker-id ID", "A unique identifier denoting the worker") do |v|
    opts[:worker_id] = v
  end

  o.on("--redis HOST", "Redis HOST to connect to (default: 127.0.0.1)") do |v|
    opts[:redis_host] = v || "127.0.0.1"
  end

  o.on("--timings", "Populate global job timings in Redis") do |v|
    opts[:timings] = v
  end

  o.on("--file-split-threshold N", "Split spec files slower than N sec. and " \
       "schedule them by example (default: 999999)") do |v|
    opts[:file_split_threshold] = Float(v)
  end

  o.on("--report", "Do not execute tests but wait until queue is empty and " \
       "print a report") do |v|
    opts[:report] = v
  end

  o.on("--report-timeout N", Integer, "Fail if queue is not empty after " \
       "N seconds. Only applicable if --report is enabled " \
       "(default: 3600)") do |v|
    opts[:report_timeout] = v
  end

end.parse!

[:build_id, :worker_id].each do |o|
  raise OptionParser::MissingArgument.new(o) if opts[o].nil?
end

if opts[:report]
  reporter = RSpecQ::Reporter.new(
    build_id: opts[:build_id],
    worker_id: opts[:worker_id],
    timeout: opts[:report_timeout] || 3600,
    redis_host: opts[:redis_host],
  )

  reporter.report
else
  worker = RSpecQ::Worker.new(
    build_id: opts[:build_id],
    worker_id: opts[:worker_id],
    redis_host: opts[:redis_host],
    files_or_dirs_to_run: ARGV[0] || "spec",
  )

  worker.populate_timings = opts[:timings]
  worker.file_split_threshold = opts[:file_split_threshold] || 999999
  worker.work
end
21 changes: 21 additions & 0 deletions lib/rspecq.rb
@@ -0,0 +1,21 @@
require "rspec/core"

module RSpecQ
  MAX_REQUEUES = 3

  # If a worker hasn't executed an RSpec example for more than this time
  # (in seconds), it is considered dead and its reserved work will be put back
  # to the queue, to be picked up by another worker.
  WORKER_LIVENESS_SEC = 60.0
end

require_relative "rspecq/formatters/example_count_recorder"
require_relative "rspecq/formatters/failure_recorder"
require_relative "rspecq/formatters/job_timing_recorder"
require_relative "rspecq/formatters/worker_heartbeat_recorder"

require_relative "rspecq/queue"
require_relative "rspecq/reporter"
require_relative "rspecq/worker"

require_relative "rspecq/version"
15 changes: 15 additions & 0 deletions lib/rspecq/formatters/example_count_recorder.rb
@@ -0,0 +1,15 @@
module RSpecQ
  module Formatters
    # Increments the example counter after each job.
    class ExampleCountRecorder
      def initialize(queue)
        @queue = queue
      end

      def dump_summary(summary)
        n = summary.examples.count
        @queue.increment_example_count(n) if n > 0
      end
    end
  end
end
50 changes: 50 additions & 0 deletions lib/rspecq/formatters/failure_recorder.rb
@@ -0,0 +1,50 @@
module RSpecQ
  module Formatters
    class FailureRecorder
      def initialize(queue, job)
        @queue = queue
        @job = job
        @colorizer = RSpec::Core::Formatters::ConsoleCodes
        @non_example_error_recorded = false
      end

      # Here we're notified about errors occurring outside of examples.
      #
      # NOTE: Upon such an error, RSpec emits multiple notifications but we only
      # want the _first_, which is the one that contains the error backtrace.
      # That's why we have to keep track of whether we've already received the
      # needed notification and act accordingly.
      def message(n)
        if RSpec.world.non_example_failure && !@non_example_error_recorded
          @queue.record_non_example_error(@job, n.message)
          @non_example_error_recorded = true
        end
      end

      def example_failed(notification)
        example = notification.example

        if @queue.requeue_job(example.id, MAX_REQUEUES)
          # HACK: try to avoid picking the job we just requeued; we want it
          # to be picked up by a different worker
          sleep 0.5
          return
        end

        presenter = RSpec::Core::Formatters::ExceptionPresenter.new(
          example.exception, example)

        msg = presenter.fully_formatted(nil, @colorizer)
        msg << "\n"
        msg << @colorizer.wrap(
          "bin/rspec #{example.location_rerun_argument}",
          RSpec.configuration.failure_color)

        msg << @colorizer.wrap(
          " # #{example.full_description}", RSpec.configuration.detail_color)

        @queue.record_example_failure(notification.example.id, msg)
      end
    end
  end
end
14 changes: 14 additions & 0 deletions lib/rspecq/formatters/job_timing_recorder.rb
@@ -0,0 +1,14 @@
module RSpecQ
  module Formatters
    class JobTimingRecorder
      def initialize(queue, job)
        @queue = queue
        @job = job
      end

      def dump_summary(summary)
        @queue.record_timing(@job, Float(summary.duration))
      end
    end
  end
end
17 changes: 17 additions & 0 deletions lib/rspecq/formatters/worker_heartbeat_recorder.rb
@@ -0,0 +1,17 @@
module RSpecQ
  module Formatters
    # Updates the respective heartbeat key of the worker after each example.
    #
    # Refer to the documentation of WORKER_LIVENESS_SEC for more info.
    class WorkerHeartbeatRecorder
      def initialize(worker)
        @worker = worker
      end

      def example_finished(*)
        @worker.update_heartbeat
      end
    end
  end
end
