A potential bug to squeeze extra memory through command line arguments

Wed Mar 2 23:52:24 UTC 2016

On Tue, Mar 01, 2016 at 03:46:12PM +0530, haris iqbal wrote:
> On Tue, Mar 1, 2016 at 2:01 PM, Dmitry V. Levin <ldv at altlinux.org> wrote:
> > On Tue, Mar 01, 2016 at 10:08:16AM +0530, haris iqbal wrote:
> >> On Tue, Mar 1, 2016 at 5:09 AM, Dmitry V. Levin <ldv at altlinux.org> wrote:
> >> >
> >> > On Wed, Feb 24, 2016 at 06:02:01PM +0530, haris iqbal wrote:
> >> > [...]
> >> > > Ok. I have come up with a separate memory model for tcbtab. In this
> >> > > model, we will use a linked list instead of a global array of pointers
> >> > > tcbtab.
> >> > >
> >> > > The structure
> >> > >
> >> > > struct s_tcbtab
> >> > > {
> >> > >     struct tcb* data;
> >> > >     struct s_tcbtab* next;
> >> > > };
> >> > >
> >> > > And a global head of the linked list.
> >> > >
> >> > > struct s_tcbtab* head_tcbtab = NULL;
> >> >
> >> > I suppose this memory model is better for some use cases and worse for
> >> > some other use cases.
> >> > What kind of strace usage would win/lose from this change?
> >>
> >> Quick question: how many pids can we give with the -p option?
> >
> > It depends on the maximum length of the arguments to execve(2),
> > which varies between systems.
> 
> One disadvantage I can think of is when there are a substantial number
> of pids with the -p option. Then the new proposed model would iterate
> over the linked list till the end while allocating a struct for every
> pid. Thus the time taken for n pids would be (n * ( n + 1)) / 2.
> 
> The same would happen with the older model. The difference is that
> the older one uses an array (so branch prediction helps, and linear
> memory access is an advantage), while the newer model uses a linked
> list and would thus be slower. But I think there won't be a visible
> time difference for a few pids with the -p option.
> 
> Is there any other disadvantage for the newer model that I missed?

The only advantage of the linked list model seems to be memory footprint,
since it doesn't allocate memory in advance.  However, memory fragmentation
might easily negate this effect.  Iteration is going to be slower, and
strace iterates a lot via pid2tcb(), so a switch to this model would
result in a measurable performance regression.
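
To make the comparison concrete, here is a minimal sketch of the two
lookups under discussion.  It is not strace's actual code: struct tcb is
reduced to a single pid field, and lookup_array(), lookup_list() and
append_list() are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for strace's struct tcb (assumption: only the
 * pid matters for this comparison). */
struct tcb {
	int pid;
};

/* List node as proposed earlier in this thread. */
struct s_tcbtab {
	struct tcb *data;
	struct s_tcbtab *next;
};

/* Array model: scan a contiguous table of pointers. */
static struct tcb *
lookup_array(struct tcb **tab, size_t n, int pid)
{
	for (size_t i = 0; i < n; ++i)
		if (tab[i] && tab[i]->pid == pid)
			return tab[i];
	return NULL;
}

/* List model: one dependent pointer load per node visited. */
static struct tcb *
lookup_list(const struct s_tcbtab *node, int pid)
{
	for (; node; node = node->next)
		if (node->data && node->data->pid == pid)
			return node->data;
	return NULL;
}

/* Append at the tail by walking the whole list, as described in the
 * quoted message; every append gets slower as the list grows. */
static int
append_list(struct s_tcbtab **head, struct tcb *t)
{
	struct s_tcbtab *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->data = t;
	node->next = NULL;
	while (*head)
		head = &(*head)->next;
	*head = node;
	return 0;
}
```

Both lookups are linear in the number of entries.  The difference is that
the array scan reads adjacent pointers, while the list walk has to load
each node before it knows where the next one lives, and the nodes may be
scattered across the heap.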

-- 
ldv