Discussion:
[libtorrent] Scalable thread pool
Steven Siloti
2016-05-08 18:24:56 UTC
Permalink
As a first step towards implementing the new memory mapped I/O[1] design
I'm going to start working on a scalable thread pool. It'll be a
separate class from disk_io_thread which will also hold the job queue
and related synchronization primitives so that we can easily create
multiple pools. For now there will be two pools, one for hash jobs and
one for everything else. If anyone (ok, Arvid) has conflicting plans let
me know.
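
Roughly what I have in mind, as a minimal sketch. All names here
(disk_job_pool, submit) are provisional, and the real class would queue
disk_io_job pointers rather than std::function; this just shows the shape:

#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

struct disk_job_pool
{
    explicit disk_job_pool(int num_threads)
    {
        for (int i = 0; i < num_threads; ++i)
            m_threads.emplace_back([this] { run(); });
    }

    ~disk_job_pool()
    {
        {
            std::unique_lock<std::mutex> l(m_mutex);
            m_shutting_down = true;
        }
        m_cond.notify_all();
        for (auto& t : m_threads) t.join();
    }

    void submit(std::function<void()> job)
    {
        {
            std::unique_lock<std::mutex> l(m_mutex);
            m_queue.push_back(std::move(job));
        }
        m_cond.notify_one();
    }

private:
    void run()
    {
        for (;;)
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> l(m_mutex);
                m_cond.wait(l, [this]
                    { return m_shutting_down || !m_queue.empty(); });
                // empty queue here means we're shutting down and drained
                if (m_queue.empty()) return;
                job = std::move(m_queue.front());
                m_queue.pop_front();
            }
            job();
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::deque<std::function<void()>> m_queue;
    std::vector<std::thread> m_threads;
    bool m_shutting_down = false;
};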


[1] https://github.com/arvidn/libtorrent/wiki/memory-mapped-I-O
Arvid Norberg
2016-05-08 19:11:07 UTC
Permalink
Post by Steven Siloti
As a first step towards implementing the new memory mapped I/O[1] design
I'm going to start working on a scalable thread pool. It'll be a
separate class from disk_io_thread which will also hold the job queue
and related synchronization primitives so that we can easily create
multiple pools. For now there will be two pools, one for hash jobs and
one for everything else. If anyone (ok, Arvid) has conflicting plans let
me know.
What about separating reading and writing into separate threads?
Post by Steven Siloti
[1] https://github.com/arvidn/libtorrent/wiki/memory-mapped-I-O
The only concern I would have starting at the thread-pool end would be that
it would likely (at least as far as I can tell) take a long time to get to
a state that can replace the current thread pool.

My idea (I wouldn't really call it a plan yet) was to mutate the current
disk_io_thread into the new architecture. The idea being that it's easier
to remove and replace code gradually than to start from scratch. Granted,
there can be benefits with starting from scratch and I think some aspects
should probably be done from scratch. However, having something that works
while working on it is really useful. It allows for incremental testing and
possibly merging interim versions to master.

I would be especially concerned that there may be some subtle problems
that are currently solved, which would need some thought (but haven't been
discovered yet) to fit into the new system. For instance:

1. the job fence mechanism right now, for locking torrents during certain
operations (a sketch of the idea follows this list).
2. the block_cache (or something similar to it, and probably a lot simpler)
would be used for the job queue itself. It's possible this could be put off
as an optimization though.
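
To illustrate the fence concept in miniature (this is not the actual
disk_job_fence code, just the idea, with made-up names):

#include <deque>
#include <mutex>

struct disk_io_job;

// A per-torrent fence. A fenced operation (say, moving storage) must
// wait for all previously issued jobs to complete, and jobs issued
// after the fence is raised are held back until it finishes.
struct job_fence
{
    // called before queuing a job. returns true if the job may be
    // queued now, false if it was deferred behind the fence
    bool allow_job(disk_io_job* j)
    {
        std::lock_guard<std::mutex> l(m_mutex);
        if (m_fence_raised) { m_blocked_jobs.push_back(j); return false; }
        ++m_outstanding;
        return true;
    }

    // raise the fence. returns the number of in-flight jobs the fenced
    // operation still has to wait for
    int raise_fence()
    {
        std::lock_guard<std::mutex> l(m_mutex);
        m_fence_raised = true;
        return m_outstanding;
    }

    // called as each job completes. once this hits zero with the fence
    // raised, the fenced operation may run
    int job_complete()
    {
        std::lock_guard<std::mutex> l(m_mutex);
        return --m_outstanding;
    }

    // lower the fence and hand back the deferred jobs for re-queuing
    std::deque<disk_io_job*> lower_fence()
    {
        std::lock_guard<std::mutex> l(m_mutex);
        m_fence_raised = false;
        std::deque<disk_io_job*> jobs;
        jobs.swap(m_blocked_jobs);
        return jobs;
    }

private:
    std::mutex m_mutex;
    bool m_fence_raised = false;
    int m_outstanding = 0;
    std::deque<disk_io_job*> m_blocked_jobs;
};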

I was going to list the mechanism for posting back job completion handlers
to the main thread too (which you recently touched). But I suppose this
would be significantly simpler now too, since jobs would never be deferred
(as they are now, hanging on pieces in the cache).

I imagine the disk_io_thread interface could remain more or less intact, at
least for a long time, while gutting and replacing the implementation. I
believe that would help contain the changes and break it down into smaller
and more manageable patches. I do expect the current storage_interface
would have to go entirely. I think it would still make sense to have some
customization point for storage, but I think the disk_io_thread interface
may be more appropriate for that going forward.

1. scrap storage_interface (but probably preserve the fundamental piece <->
file mapping in default_storage)
2. the file_pool would morph into a mapping_pool, keeping an LRU (or
something) of mapped areas of files; a rough sketch follows this list.
(Perhaps this could be put off as an optimization as well. However, I
believe Python on Windows is still running as a 32-bit process.)
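
Something like this shape for the pool. All names are made up, it maps
whole files keyed by an opaque file id, and how the mapping is created
and torn down (mmap/MapViewOfFile) is elided; the real thing would
track mapped areas of files:

#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>

// one mapped region of a file
struct file_mapping
{
    void* base = nullptr;
    std::size_t length = 0;
};

// LRU cache of file mappings, capped by total mapped bytes. The cap is
// what matters in 32-bit processes, where address space runs out long
// before memory does.
struct mapping_pool
{
    explicit mapping_pool(std::size_t max_mapped_bytes)
        : m_max_bytes(max_mapped_bytes) {}

    // look up a mapping by file id, refreshing its LRU position
    file_mapping* find(std::uint64_t file_id)
    {
        auto it = m_index.find(file_id);
        if (it == m_index.end()) return nullptr;
        // move the entry to the front of the LRU list
        m_lru.splice(m_lru.begin(), m_lru, it->second);
        return &it->second->second;
    }

    // insert a new mapping, evicting least-recently-used entries to
    // stay under the byte cap (assumes file_id isn't already present)
    void insert(std::uint64_t file_id, file_mapping m)
    {
        m_mapped_bytes += m.length;
        m_lru.emplace_front(file_id, m);
        m_index[file_id] = m_lru.begin();
        while (m_mapped_bytes > m_max_bytes && m_lru.size() > 1)
        {
            entry& victim = m_lru.back();
            m_mapped_bytes -= victim.second.length;
            // unmapping of the victim region elided
            m_index.erase(victim.first);
            m_lru.pop_back();
        }
    }

private:
    using entry = std::pair<std::uint64_t, file_mapping>;
    std::list<entry> m_lru; // front == most recently used
    std::unordered_map<std::uint64_t, std::list<entry>::iterator> m_index;
    std::size_t m_mapped_bytes = 0;
    std::size_t m_max_bytes;
};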


To put things in perspective though, I would greatly appreciate any work
you put in towards this, so if you think my arguments are weak, please feel
free to ignore them!
--
Arvid Norberg
Steven Siloti
2016-05-08 20:15:03 UTC
Permalink
<snip>
What about separating reading and writing into separate threads?
That can come later. The thread pool class will be agnostic to the type
of jobs it's running so changing the job -> pool mapping won't impact it.
The only concern I would have starting at the thread-pool end would be that
it would likely (at least as far as I can tell) take a long time to get to
a state that can replace the current thread pool.
My idea (I wouldn't really call it a plan yet) was to mutate the current
disk_io_thread into the new architecture. The idea being that it's easier
to remove and replace code gradually than to start from scratch. Granted,
there can be benefits with starting from scratch and I think some aspects
should probably be done from scratch. However, having something that works
while working on it is really useful. It allows for incremental testing and
possibly merging interim versions to master.
I think you're imagining something a bit wider in scope than what I'm
proposing. I would make only minimal modifications to disk_io_thread to
account for m_threads, m_queued_jobs, m_queued_hash_jobs, and associated
members being moved to a new class which disk_io_thread would hold two
instances of in place of the two existing job queues.
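
In other words, roughly this (class and member names are placeholders):

// stand-in for the pool class sketched at the top of the thread
struct disk_job_pool {};

struct disk_io_thread
{
    // ... the existing public interface stays as-is ...

private:
    // replaces m_threads, m_queued_jobs, m_queued_hash_jobs and the
    // synchronization primitives that guarded them
    disk_job_pool m_generic_pool; // everything but hashing
    disk_job_pool m_hash_pool;    // hash jobs
};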
I would be especially concerned that there may be some subtle problems
that are currently solved, which would need some thought (but haven't been
discovered yet) to fit into the new system. For instance:
1. the job fence mechanism right now, for locking torrents during certain
operations.
2. the block_cache (or something similar to it, and probably a lot simpler)
would be used for the job queue itself. It's possible this could be put off
as an optimization though.
I wasn't planning on touching these beyond changes to account for moving
the aforementioned members into a separate class.
<snip>
I imagine the disk_io_thread interface could remain more or less intact, at
least for a long time, while gutting and replacing the implementation. I
believe that would help contain the changes and break it down into smaller
and more manageable patches. I do expect the current storage_interface
would have to go entirely. I think it would still make sense to have some
customization point for storage, but I think the disk_io_thread interface
may be more appropriate for that going forward.
1. scrap storage_interface (but probably preserve the fundamental piece <->
file mapping in default_storage)
2. the file_pool would morph into a mapping_pool, keeping an LRU (or
something) of mapped areas of files. (perhaps this could be put off as an
optimization as well. However, I believe Python on Windows is still running
as a 32-bit process)
Agreed.
To put things in perspective though, I would greatly appreciate any work
you put in towards this, so if you think my arguments are weak, please feel
free to ignore them!
The other option I was considering was starting at the lowest level and
modifying the file class to use mmapped I/O. For the initial version I
would forgo a pool of mmapped regions and just have a full mapping of
each file stored alongside the file handle. The next step would be to
add a mapping_pool because an unfortunate number of people are still
using 32-bit builds :( If you would prefer me to go down this path, that's
fine; I don't feel strongly about it either way.
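
i.e. something like this (POSIX-only sketch with error handling mostly
elided; Windows would go through CreateFileMapping/MapViewOfFile):

#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// a file handle that carries a full mapping of the file. no mapping
// pool yet; every open file is mapped in its entirety
struct mapped_file
{
    int fd = -1;
    void* view = nullptr;
    std::size_t size = 0;

    bool open(char const* path)
    {
        fd = ::open(path, O_RDWR);
        if (fd < 0) return false;
        struct stat st;
        if (::fstat(fd, &st) != 0) return false;
        size = static_cast<std::size_t>(st.st_size);
        view = ::mmap(nullptr, size, PROT_READ | PROT_WRITE
            , MAP_SHARED, fd, 0);
        if (view == MAP_FAILED) { view = nullptr; return false; }
        return true;
    }

    ~mapped_file()
    {
        if (view != nullptr) ::munmap(view, size);
        if (fd >= 0) ::close(fd);
    }
};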
Arvid Norberg
2016-05-08 23:23:29 UTC
Permalink
Post by Steven Siloti
Post by Arvid Norberg
[...]
My idea (I wouldn't really call it a plan yet) was to mutate the current
disk_io_thread into the new architecture. The idea being that it's easier
to remove and replace code gradually than to start from scratch. Granted,
there can be benefits with starting from scratch and I think some aspects
should probably be done from scratch. However, having something that works
while working on it is really useful. It allows for incremental testing and
possibly merging interim versions to master.
I think you're imagining something a bit wider in scope than what I'm
proposing. I would make only minimal modifications to disk_io_thread to
account for m_threads, m_queued_jobs, m_queued_hash_jobs, and associated
members being moved to a new class which disk_io_thread would hold two
instances of in place of the two existing job queues.
I see. The thread pools would still fit into what's there right now then?
And perhaps even be mergeable into master.
Post by Steven Siloti
Post by Arvid Norberg
I would be especially concerned that there may be some subtle problems
that are currently solved, which would need some thought (but haven't been
discovered yet) to fit into the new system. For instance:
1. the job fence mechanism right now, for locking torrents during certain
operations.
2. the block_cache (or something similar to it, and probably a lot simpler)
would be used for the job queue itself. It's possible this could be put off
as an optimization though.
I wasn't planning on touching these beyond changes to account for moving
the aforementioned members into a separate class.
Post by Arvid Norberg
[...]
To put things in perspective though, I would greatly appreciate any work
you put in towards this, so if you think my arguments are weak, please feel
free to ignore them!
The other option I was considering was starting at the lowest level and
modifying the file class to use mmapped I/O. For the initial version I
would forgo a pool of mmapped regions and just have a full mapping of
each file stored alongside the file handle. The next step would be to
add a mapping_pool because an unfortunate number of people are still
using 32-bit builds :( If you would prefer me to go down this path, that's
fine; I don't feel strongly about it either way.
I don't have a strong opinion. It sounds like the two approaches are
orthogonal.
--
Arvid Norberg
Steven Siloti
2016-05-08 23:43:45 UTC
Permalink
Post by Arvid Norberg
Post by Steven Siloti
I think you're imagining something a bit wider in scope than what I'm
proposing. I would make only minimal modifications to disk_io_thread to
account for m_threads, m_queued_jobs, m_queued_hash_jobs, and associated
members being moved to a new class which disk_io_thread would hold two
instances of in place of the two existing job queues.
I see. The thread pools would still fit into what's there right now then?
And perhaps even be mergeable into master.
Yes, that's the idea.
Arvid Norberg
2016-05-09 05:55:32 UTC
Permalink
Post by Steven Siloti
Post by Arvid Norberg
Post by Steven Siloti
I think you're imagining something a bit wider in scope than what I'm
proposing. I would make only minimal modifications to disk_io_thread to
account for m_threads, m_queued_jobs, m_queued_hash_jobs, and associated
members being moved to a new class which disk_io_thread would hold two
instances of in place of the two existing job queues.
I see. The thread pools would still fit into what's there right now then?
And perhaps even be mergeable into master.
Yes, that's the idea.
I envision the job queue itself, on top of being a queue, also having an
index by (torrent, piece, block) to support cache hits in the queue.
Presumably the index would only be provided for write jobs. With this in
mind, it may make sense to have the thread pool be parameterized on the
queue type it uses, and make the queue abstraction itself provide mutual
exclusion.

Putting off supporting cache hits in the queue may be difficult as well,
since guaranteeing correct ordering of write jobs and subsequent read jobs
is hard with multiple disk threads. Guaranteeing ordering in the other
direction is fortunately not important (i.e. a read chased by a write).
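
To sketch what I mean (all names hypothetical; removing entries from
the index after a write is flushed, and the pop side, are elided):

#include <cstdint>
#include <deque>
#include <mutex>
#include <unordered_map>

struct disk_io_job; // queued write jobs carry their buffer

// key identifying one block on disk
struct block_key
{
    std::uint32_t storage; // i.e. the torrent
    std::uint32_t piece;
    std::uint32_t block;
    bool operator==(block_key const& o) const
    {
        return storage == o.storage && piece == o.piece
            && block == o.block;
    }
};

struct block_key_hash
{
    std::size_t operator()(block_key const& k) const
    {
        return std::size_t(k.storage) * 0x9e3779b9
            ^ std::size_t(k.piece) * 0x85ebca6b
            ^ std::size_t(k.block);
    }
};

// a job queue that doubles as an index of queued writes, providing
// both mutual exclusion and "cache hits" in the queue. a read job is
// checked against the index first; on a hit it's served from the
// queued write's buffer, which is also what keeps write -> subsequent
// read ordering correct with multiple disk threads
struct indexed_job_queue
{
    void push_write(block_key k, disk_io_job* j)
    {
        std::lock_guard<std::mutex> l(m_mutex);
        m_queue.push_back(j);
        m_index[k] = j;
    }

    // returns the queued write for this block, if any (a cache hit)
    disk_io_job* find_write(block_key k)
    {
        std::lock_guard<std::mutex> l(m_mutex);
        auto it = m_index.find(k);
        return it == m_index.end() ? nullptr : it->second;
    }

private:
    std::mutex m_mutex;
    std::deque<disk_io_job*> m_queue;
    std::unordered_map<block_key, disk_io_job*, block_key_hash> m_index;
};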
--
Arvid Norberg