[llvm-dev] RFC: Should SmallVectors be smaller?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

[llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev
I've been curious for a while whether SmallVectors have the right speed/memory tradeoff.  It would be straightforward to shave off a couple of pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could afford to test for small-mode vs. large-mode.

The current scheme works out to something like this:
```
template <class T, size_t SmallCapacity>
struct SmallVector {
  T *BeginX, *EndX, *CapacityX;
  T Small[SmallCapacity];

  bool isSmall() const { return BeginX == Small; }
  T *begin() { return BeginX; }
  T *end() { return EndX; }
  size_t size() const { return EndX - BeginX; }
  size_t capacity() const { return CapacityX - BeginX; }
};
```

In the past I used something more like:
```
template <class T, size_t SmallCapacity>
struct SmallVector2 {
  unsigned Size;
  unsigned Capacity;
  union {
    T Small[SmallCapacity];
    T *Large;
  };

  bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit shaved off of Capacity.
  T *begin() { return isSmall() ? Small : Large; }
  T *end() { return begin() + Size; }
  size_t size() const { return Size; }
  size_t capacity() const { return Capacity; }
};
```

I'm curious whether this scheme would be really be slower in practice (as a complete replacement for `SmallVector` in ADT).  I wonder, has anyone profiled something like this before?  If so, in what context?  on what workloads?

Duncan
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev


> On Jun 21, 2018, at 9:52 AM, Duncan P. N. Exon Smith via llvm-dev <[hidden email]> wrote:
>
> I've been curious for a while whether SmallVectors have the right speed/memory tradeoff.  It would be straightforward to shave off a couple of pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could afford to test for small-mode vs. large-mode.

Something like this could definitely work, but most smallvectors are on the stack.  They are intentionally used when sizeof(smallvector) isn’t important, so I don’t think this optimization will pay off.

Out of curiosity, what brings this up?

-Chris


>
> The current scheme works out to something like this:
> ```
> template <class T, size_t SmallCapacity>
> struct SmallVector {
>  T *BeginX, *EndX, *CapacityX;
>  T Small[SmallCapacity];
>
>  bool isSmall() const { return BeginX == Small; }
>  T *begin() { return BeginX; }
>  T *end() { return EndX; }
>  size_t size() const { return EndX - BeginX; }
>  size_t capacity() const { return CapacityX - BeginX; }
> };
> ```
>
> In the past I used something more like:
> ```
> template <class T, size_t SmallCapacity>
> struct SmallVector2 {
>  unsigned Size;
>  unsigned Capacity;
>  union {
>    T Small[SmallCapacity];
>    T *Large;
>  };
>
>  bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit shaved off of Capacity.
>  T *begin() { return isSmall() ? Small : Large; }
>  T *end() { return begin() + Size; }
>  size_t size() const { return Size; }
>  size_t capacity() const { return Capacity; }
> };
> ```
>
> I'm curious whether this scheme would be really be slower in practice (as a complete replacement for `SmallVector` in ADT).  I wonder, has anyone profiled something like this before?  If so, in what context?  on what workloads?
>
> Duncan
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev
In reply to this post by Muhui Jiang via llvm-dev


> On 22 Jun 2018, at 02:52, Duncan P. N. Exon Smith via llvm-dev <[hidden email]> wrote:
>
> I've been curious for a while whether SmallVectors have the right speed/memory tradeoff.  It would be straightforward to shave off a couple of pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could afford to test for small-mode vs. large-mode.
>
> The current scheme works out to something like this:
> ```
> template <class T, size_t SmallCapacity>
> struct SmallVector {
>  T *BeginX, *EndX, *CapacityX;
>  T Small[SmallCapacity];
>
>  bool isSmall() const { return BeginX == Small; }
>  T *begin() { return BeginX; }
>  T *end() { return EndX; }
>  size_t size() const { return EndX - BeginX; }
>  size_t capacity() const { return CapacityX - BeginX; }
> };
> ```
>
> In the past I used something more like:
> ```
> template <class T, size_t SmallCapacity>
> struct SmallVector2 {
>  unsigned Size;
>  unsigned Capacity;
>  union {
>    T Small[SmallCapacity];
>    T *Large;
>  };
>
>  bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit shaved off of Capacity.
>  T *begin() { return isSmall() ? Small : Large; }
>  T *end() { return begin() + Size; }
>  size_t size() const { return Size; }
>  size_t capacity() const { return Capacity; }
> };
> ```
>
> I'm curious whether this scheme would be really be slower in practice (as a complete replacement for `SmallVector` in ADT).  I wonder, has anyone profiled something like this before?  If so, in what context?  on what workloads?
>

Doesn’t this scheme have a problem with undefined behaviour, since you may be changing the active member of the union when capacity grows larger than SmallCapacity?

-- Dean

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev
To Dean,

I think Duncan’s approach prohibit any usage of Small after the capacity grow over SmallCapacity.
So when the capacity exceed SmallCapacity, one should:
1. Allocate memory on heap
2. Copy data from  Small to that chunk
3. Assign pointer of that chunk to Large

As long as you always access Large after growth, there would be no data lose.

Bekket

> Dean Michael Berris via llvm-dev <[hidden email]> 於 2018年6月21日 下午10:01 寫道:
>
>
>
>> On 22 Jun 2018, at 02:52, Duncan P. N. Exon Smith via llvm-dev <[hidden email]> wrote:
>>
>> I've been curious for a while whether SmallVectors have the right speed/memory tradeoff.  It would be straightforward to shave off a couple of pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could afford to test for small-mode vs. large-mode.
>>
>> The current scheme works out to something like this:
>> ```
>> template <class T, size_t SmallCapacity>
>> struct SmallVector {
>> T *BeginX, *EndX, *CapacityX;
>> T Small[SmallCapacity];
>>
>> bool isSmall() const { return BeginX == Small; }
>> T *begin() { return BeginX; }
>> T *end() { return EndX; }
>> size_t size() const { return EndX - BeginX; }
>> size_t capacity() const { return CapacityX - BeginX; }
>> };
>> ```
>>
>> In the past I used something more like:
>> ```
>> template <class T, size_t SmallCapacity>
>> struct SmallVector2 {
>> unsigned Size;
>> unsigned Capacity;
>> union {
>>   T Small[SmallCapacity];
>>   T *Large;
>> };
>>
>> bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit shaved off of Capacity.
>> T *begin() { return isSmall() ? Small : Large; }
>> T *end() { return begin() + Size; }
>> size_t size() const { return Size; }
>> size_t capacity() const { return Capacity; }
>> };
>> ```
>>
>> I'm curious whether this scheme would be really be slower in practice (as a complete replacement for `SmallVector` in ADT).  I wonder, has anyone profiled something like this before?  If so, in what context?  on what workloads?
>>
>
> Doesn’t this scheme have a problem with undefined behaviour, since you may be changing the active member of the union when capacity grows larger than SmallCapacity?
>
> -- Dean
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev


> On 22 Jun 2018, at 12:30, Bekket McClane <[hidden email]> wrote:
>
> To Dean,
>
> I think Duncan’s approach prohibit any usage of Small after the capacity grow over SmallCapacity.
> So when the capacity exceed SmallCapacity, one should:
> 1. Allocate memory on heap
> 2. Copy data from  Small to that chunk
> 3. Assign pointer of that chunk to Large
>
> As long as you always access Large after growth, there would be no data lose.
>

Ah, yes. That makes sense.

I’m curious to see, like Chris, what the benchmarks say about this alternative approach.

-- Dean

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev
In reply to this post by Muhui Jiang via llvm-dev


On Jun 21, 2018, at 18:38, Chris Lattner <[hidden email]> wrote:



On Jun 21, 2018, at 9:52 AM, Duncan P. N. Exon Smith via llvm-dev <[hidden email]> wrote:

I've been curious for a while whether SmallVectors have the right speed/memory tradeoff.  It would be straightforward to shave off a couple of pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could afford to test for small-mode vs. large-mode.

Something like this could definitely work, but most smallvectors are on the stack.  They are intentionally used when sizeof(smallvector) isn’t important, so I don’t think this optimization will pay off.

For better or worse (mostly worse), there are a ton of SmallVector fields in data structures, including some even nested inside other SmallVectors (e.g., see the cleanup in r235229).  Often these data structures are heap-allocated.

Out of curiosity, what brings this up?

I've noticed that Clang is using more stack recently (we're seeing more crashes from template recursion; it seems the template recursion limit needs to shrink), and somehow that train of thought led to this.

I share your skepticism that it will help stack usage much, but SmallVector/SmallVectorImpl is so ubiquitous, it could help the heap a bit.  And if it doesn’t hurt runtime performance in practice, there’s no reason to fork the data structure.

If no one has measured before I might try it some time. 

-Chris



The current scheme works out to something like this:
```
template <class T, size_t SmallCapacity>
struct SmallVector {
T *BeginX, *EndX, *CapacityX;
T Small[SmallCapacity];

bool isSmall() const { return BeginX == Small; }
T *begin() { return BeginX; }
T *end() { return EndX; }
size_t size() const { return EndX - BeginX; }
size_t capacity() const { return CapacityX - BeginX; }
};
```

In the past I used something more like:
```
template <class T, size_t SmallCapacity>
struct SmallVector2 {
unsigned Size;
unsigned Capacity;
union {
  T Small[SmallCapacity];
  T *Large;
};

bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit shaved off of Capacity.
T *begin() { return isSmall() ? Small : Large; }
T *end() { return begin() + Size; }
size_t size() const { return Size; }
size_t capacity() const { return Capacity; }
};
```

I'm curious whether this scheme would be really be slower in practice (as a complete replacement for `SmallVector` in ADT).  I wonder, has anyone profiled something like this before?  If so, in what context?  on what workloads?

Duncan
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev
On Thu, Jun 21, 2018 at 9:16 PM Duncan P. N. Exon Smith via llvm-dev <[hidden email]> wrote:


On Jun 21, 2018, at 18:38, Chris Lattner <[hidden email]> wrote:



On Jun 21, 2018, at 9:52 AM, Duncan P. N. Exon Smith via llvm-dev <[hidden email]> wrote:

I've been curious for a while whether SmallVectors have the right speed/memory tradeoff.  It would be straightforward to shave off a couple of pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could afford to test for small-mode vs. large-mode.

Something like this could definitely work, but most smallvectors are on the stack.  They are intentionally used when sizeof(smallvector) isn’t important, so I don’t think this optimization will pay off.

For better or worse (mostly worse), there are a ton of SmallVector fields in data structures, including some even nested inside other SmallVectors (e.g., see the cleanup in r235229).  Often these data structures are heap-allocated.

Yes, this is a huge problem. We seriously overuse SmallVector. I think in CodeViewDebug.cpp we had a DenseMap of a struct which had a SmallVector of structs that contained SmallVectors. It was silly.

Out of curiosity, what brings this up?

I've noticed that Clang is using more stack recently (we're seeing more crashes from template recursion; it seems the template recursion limit needs to shrink), and somehow that train of thought led to this.

I share your skepticism that it will help stack usage much, but SmallVector/SmallVectorImpl is so ubiquitous, it could help the heap a bit.  And if it doesn’t hurt runtime performance in practice, there’s no reason to fork the data structure.

If no one has measured before I might try it some time. 

I think it's important to keep begin(), end(), and indexing operations branchless, so I'm not sure this pointer union is the best idea. I haven't profiled, but that's my intuition. If you wanted to limit all our vectors to 4 billion elements to save a pointer, I'd probably be fine with that.

I think we might be better off just reducing the pre-allocation size of most of our SmallVectors across LLVM and Clang. They're all wild guesses, never profiled. Especially for vectors of relatively "large" elements, the pre-allocation optimization just doesn't make that much sense. I'd go as far as to suggest providing a default SmallVector N value of something like `sizeof(void*) * 3 / sizeof(T)`, i.e. by default, every SmallVector is at most 6 pointers big.

---

Relatedly, there's a lot of work that can be done to tune DenseMap. When the key and value pair is relatively large, we waste a lot of space on empty table slots.

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev


On Jun 22, 2018, at 15:18, Reid Kleckner <[hidden email]> wrote:

On Thu, Jun 21, 2018 at 9:16 PM Duncan P. N. Exon Smith via llvm-dev <[hidden email]> wrote:
Out of curiosity, what brings this up?

I've noticed that Clang is using more stack recently (we're seeing more crashes from template recursion; it seems the template recursion limit needs to shrink), and somehow that train of thought led to this.

I share your skepticism that it will help stack usage much, but SmallVector/SmallVectorImpl is so ubiquitous, it could help the heap a bit.  And if it doesn’t hurt runtime performance in practice, there’s no reason to fork the data structure.

If no one has measured before I might try it some time. 

I think it's important to keep begin(), end(), and indexing operations branchless, so I'm not sure this pointer union is the best idea. I haven't profiled, but that's my intuition. If you wanted to limit all our vectors to 4 billion elements to save a pointer, I'd probably be fine with that.

Good point, there are two separable changes here and only the union part is likely to have compile-time slowdowns.  I threw together https://reviews.llvm.org/D48518 (currently building with ASan to run check-llvm) and the surely uncontroversial https://reviews.llvm.org/D48516.

I think we might be better off just reducing the pre-allocation size of most of our SmallVectors across LLVM and Clang. They're all wild guesses, never profiled. Especially for vectors of relatively "large" elements, the pre-allocation optimization just doesn't make that much sense. I'd go as far as to suggest providing a default SmallVector N value of something like `sizeof(void*) * 3 / sizeof(T)`, i.e. by default, every SmallVector is at most 6 pointers big.

Interesting idea... and then audit current instances to drop the size argument.

Note that a SmallVector with N value of 0 takes the same storage as an N value of 1, so very large sizeof(T) would still use more than 6 pointers.  The cause is that SmallVectorTemplateCommon stores the first element so that it can detect small mode by comparing BeginX against &FirstEl.  The fix would be to shave a bit off of capacity (dropping max capacity to 2B)... likely reasonable.

If we're going to audit anyway, I wonder if forking names would make sense.  E.g., the current thing would be less tempting to use in data structures if it were called StackVector.  But that wouldn't be a fun change to roll out across sub-projects.

---

Relatedly, there's a lot of work that can be done to tune DenseMap. When the key and value pair is relatively large, we waste a lot of space on empty table slots.


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev
In reply to this post by Muhui Jiang via llvm-dev
On Jun 21, 2018, at 9:16 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:

Something like this could definitely work, but most smallvectors are on the stack.  They are intentionally used when sizeof(smallvector) isn’t important, so I don’t think this optimization will pay off.

For better or worse (mostly worse), there are a ton of SmallVector fields in data structures, including some even nested inside other SmallVectors (e.g., see the cleanup in r235229).  Often these data structures are heap-allocated.

Egads, that seems like a big issue, and (most of the time, but perhaps not always) a misuse of SmallVector.  It seems that this should be fixed, instead of changing smallvector.

Out of curiosity, what brings this up?

I've noticed that Clang is using more stack recently (we're seeing more crashes from template recursion; it seems the template recursion limit needs to shrink), and somehow that train of thought led to this.

I share your skepticism that it will help stack usage much, but SmallVector/SmallVectorImpl is so ubiquitous, it could help the heap a bit.  And if it doesn’t hurt runtime performance in practice, there’s no reason to fork the data structure.

If no one has measured before I might try it some time. 

I’m not familiar with the modern structure of the code, but is there any chance these algorithms can be reworked to be iterative instead of recursive?  Shrinking individual frames may buy some time, but isn’t really a fix.

-Chris


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev
In reply to this post by Muhui Jiang via llvm-dev


On Jun 23, 2018, at 9:11 AM, Duncan P. N. Exon Smith <[hidden email]> wrote:


I think we might be better off just reducing the pre-allocation size of most of our SmallVectors across LLVM and Clang. They're all wild guesses, never profiled. Especially for vectors of relatively "large" elements, the pre-allocation optimization just doesn't make that much sense. I'd go as far as to suggest providing a default SmallVector N value of something like `sizeof(void*) * 3 / sizeof(T)`, i.e. by default, every SmallVector is at most 6 pointers big.

Interesting idea... and then audit current instances to drop the size argument.

Note that a SmallVector with N value of 0 takes the same storage as an N value of 1, so very large sizeof(T) would still use more than 6 pointers.  The cause is that SmallVectorTemplateCommon stores the first element so that it can detect small mode by comparing BeginX against &FirstEl.  The fix would be to shave a bit off of capacity (dropping max capacity to 2B)... likely reasonable.

The patch LGTM, but why would someone actually have a SmallVector with N = 0?  Isn’t that a vector?

Also if you’re not familiar with it, TinyPtrVector is a very useful type for vectors that are highly biased towards 0/1 element and whose elements are pointer size.  It was added relatively late in LLVM’s evolution, so I wouldn’t be surprised if there are still smallvectors that should be upgraded.  TinyPtrVector is designed for use on the heap.

-Chris



If we're going to audit anyway, I wonder if forking names would make sense.  E.g., the current thing would be less tempting to use in data structures if it were called StackVector.  But that wouldn't be a fun change to roll out across sub-projects.



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev


On Jun 23, 2018, at 10:14, Chris Lattner <[hidden email]> wrote:



On Jun 23, 2018, at 9:11 AM, Duncan P. N. Exon Smith <[hidden email]> wrote:


I think we might be better off just reducing the pre-allocation size of most of our SmallVectors across LLVM and Clang. They're all wild guesses, never profiled. Especially for vectors of relatively "large" elements, the pre-allocation optimization just doesn't make that much sense. I'd go as far as to suggest providing a default SmallVector N value of something like `sizeof(void*) * 3 / sizeof(T)`, i.e. by default, every SmallVector is at most 6 pointers big.

Interesting idea... and then audit current instances to drop the size argument.

Note that a SmallVector with N value of 0 takes the same storage as an N value of 1, so very large sizeof(T) would still use more than 6 pointers.  The cause is that SmallVectorTemplateCommon stores the first element so that it can detect small mode by comparing BeginX against &FirstEl.  The fix would be to shave a bit off of capacity (dropping max capacity to 2B)... likely reasonable.

The patch LGTM, but why would someone actually have a SmallVector with N = 0?  Isn’t that a vector?

It's a vector that can be passed as a SmallVectorImpl parameter.  But yeah, mostly misguided.

Also if you’re not familiar with it, TinyPtrVector is a very useful type for vectors that are highly biased towards 0/1 element and whose elements are pointer size.  It was added relatively late in LLVM’s evolution, so I wouldn’t be surprised if there are still smallvectors that should be upgraded.  TinyPtrVector is designed for use on the heap.

Yup, it's great for pointers.  Maybe we should make a TinyVector for non-pointers and call it a day.

If we're going to audit anyway, I wonder if forking names would make sense.  E.g., the current thing would be less tempting to use in data structures if it were called StackVector.  But that wouldn't be a fun change to roll out across sub-projects.




_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev


On Jun 23, 2018, at 11:27, Duncan P. N. Exon Smith <[hidden email]> wrote:

The patch LGTM, but why would someone actually have a SmallVector with N = 0?  Isn’t that a vector?

It's a vector that can be passed as a SmallVectorImpl parameter.  But yeah, mostly misguided.


There's another explanation given in llvm/include/llvm/ADT/IndexedMap.h:
```
template <typename T, typename ToIndexT = identity<unsigned>>
  class IndexedMap {
    using IndexT = typename ToIndexT::argument_type;
    // Prefer SmallVector with zero inline storage over std::vector. IndexedMaps
    // can grow very large and SmallVector grows more efficiently as long as T
    // is trivially copyable.
    using StorageT = SmallVector<T, 0>;

```
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev


On Jun 23, 2018, at 11:35, Duncan P. N. Exon Smith <[hidden email]> wrote:



On Jun 23, 2018, at 11:27, Duncan P. N. Exon Smith <[hidden email]> wrote:

The patch LGTM, but why would someone actually have a SmallVector with N = 0?  Isn’t that a vector?

It's a vector that can be passed as a SmallVectorImpl parameter.  But yeah, mostly misguided.


There's another explanation given in llvm/include/llvm/ADT/IndexedMap.h:
```
template <typename T, typename ToIndexT = identity<unsigned>>
  class IndexedMap {
    using IndexT = typename ToIndexT::argument_type;
    // Prefer SmallVector with zero inline storage over std::vector. IndexedMaps
    // can grow very large and SmallVector grows more efficiently as long as T
    // is trivially copyable.
    using StorageT = SmallVector<T, 0>;

```

And another explanation in r266909, along the same lines:

    IR: Use SmallVector instead of std::vector of TrackingMDRef
    
    Don't use std::vector<TrackingMDRef>, since (at least in some versions
    of libc++) std::vector apparently copies values on grow operations
    instead of moving them.  Found this when I was temporarily deleting the
    copy constructor for TrackingMDRef to investigate a performance
    bottleneck.



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev


On Jun 23, 2018, at 11:48 AM, Duncan P. N. Exon Smith <[hidden email]> wrote:



On Jun 23, 2018, at 11:35, Duncan P. N. Exon Smith <[hidden email]> wrote:



On Jun 23, 2018, at 11:27, Duncan P. N. Exon Smith <[hidden email]> wrote:

The patch LGTM, but why would someone actually have a SmallVector with N = 0?  Isn’t that a vector?

It's a vector that can be passed as a SmallVectorImpl parameter.  But yeah, mostly misguided.


There's another explanation given in llvm/include/llvm/ADT/IndexedMap.h:
```
template <typename T, typename ToIndexT = identity<unsigned>>
  class IndexedMap {
    using IndexT = typename ToIndexT::argument_type;
    // Prefer SmallVector with zero inline storage over std::vector. IndexedMaps
    // can grow very large and SmallVector grows more efficiently as long as T
    // is trivially copyable.
    using StorageT = SmallVector<T, 0>;

```

And another explanation in r266909, along the same lines:

    IR: Use SmallVector instead of std::vector of TrackingMDRef
    
    Don't use std::vector<TrackingMDRef>, since (at least in some versions
    of libc++) std::vector apparently copies values on grow operations
    instead of moving them.  Found this when I was temporarily deleting the
    copy constructor for TrackingMDRef to investigate a performance
    bottleneck.

Huh, ok, still seems dubious, but I’m certainly not opposed to making SmallVector<0> better!  Thanks Duncan,

-Chris


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev
In reply to this post by Muhui Jiang via llvm-dev


On Jun 23, 2018, at 11:27 AM, Duncan P. N. Exon Smith <[hidden email]> wrote:


Also if you’re not familiar with it, TinyPtrVector is a very useful type for vectors that are highly biased towards 0/1 element and whose elements are pointer size.  It was added relatively late in LLVM’s evolution, so I wouldn’t be surprised if there are still smallvectors that should be upgraded.  TinyPtrVector is designed for use on the heap.

Yup, it's great for pointers.  Maybe we should make a TinyVector for non-pointers and call it a day.

Sure.  The nice thing about TinyPtrVector is that sizeof(TinyPtrVector) is sizeof(void*) through bitstealing, which you can’t achieve with an arbitrary T.

-Chris


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev
(add back llvm-dev)

On Jun 23, 2018, at 16:16, Duncan P. N. Exon Smith <[hidden email]> wrote:



On Jun 23, 2018, at 16:13, Chris Lattner <[hidden email]> wrote:



On Jun 23, 2018, at 11:27 AM, Duncan P. N. Exon Smith <[hidden email]> wrote:


Also if you’re not familiar with it, TinyPtrVector is a very useful type for vectors that are highly biased towards 0/1 element and whose elements are pointer size.  It was added relatively late in LLVM’s evolution, so I wouldn’t be surprised if there are still smallvectors that should be upgraded.  TinyPtrVector is designed for use on the heap.

Yup, it's great for pointers.  Maybe we should make a TinyVector for non-pointers and call it a day.

Sure.  The nice thing about TinyPtrVector is that sizeof(TinyPtrVector) is sizeof(void*) through bitstealing, which you can’t achieve with an arbitrary T.

Right, it won't be as good the Ptr version for anything sizeof >= 8, but we could fairly blindly move SmallVector with N=0 over to it.


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Should SmallVectors be smaller?

Muhui Jiang via llvm-dev
In reply to this post by Muhui Jiang via llvm-dev


On Sat, Jun 23, 2018 at 11:48 AM, Duncan P. N. Exon Smith via llvm-dev <[hidden email]> wrote:


On Jun 23, 2018, at 11:35, Duncan P. N. Exon Smith <[hidden email]> wrote:



On Jun 23, 2018, at 11:27, Duncan P. N. Exon Smith <[hidden email]> wrote:

The patch LGTM, but why would someone actually have a SmallVector with N = 0?  Isn’t that a vector?

It's a vector that can be passed as a SmallVectorImpl parameter.  But yeah, mostly misguided.


There's another explanation given in llvm/include/llvm/ADT/IndexedMap.h:
```
template <typename T, typename ToIndexT = identity<unsigned>>
  class IndexedMap {
    using IndexT = typename ToIndexT::argument_type;
    // Prefer SmallVector with zero inline storage over std::vector. IndexedMaps
    // can grow very large and SmallVector grows more efficiently as long as T
    // is trivially copyable.
    using StorageT = SmallVector<T, 0>;

```

And another explanation in r266909, along the same lines:

    IR: Use SmallVector instead of std::vector of TrackingMDRef
    
    Don't use std::vector<TrackingMDRef>, since (at least in some versions
    of libc++) std::vector apparently copies values on grow operations
    instead of moving them.  Found this when I was temporarily deleting the
    copy constructor for TrackingMDRef to investigate a performance
    bottleneck.



It seems that a POD SmallVector can use realloc(3) to grow, which can just twiddle page tables for large POD vectors.

-- Sean Silva
 

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev