Journal Article A framework for accelerating bottlenecks in GPU execution with assist warps 2016 372-415 Vijaykumar N, Pekhimenko G, Jog A, Ghose S, Bhowmick A, Ausavarungnirun R, Das C, Kandemir M, Mowry TC, Mutlu O
Preprint A Framework for Accelerating Bottlenecks in GPU Execution with Assist Warps 2016 Vijaykumar N, Pekhimenko G, Jog A, Ghose S, Bhowmick A, Ausavarungnirun R, Das C, Kandemir M, Mowry TC, Mutlu O
Journal Article Buddy-RAM: Improving the Performance and Efficiency of Bulk Bitwise Operations Using DRAM 2016 Seshadri V, Lee D, Mullins T, Hassan H, Boroumand A, Kim J, Kozuch MA, Mutlu O, Gibbons PB, Mowry TC
Journal Article Mitigating the Memory Bottleneck With Approximate Load Value Prediction 2016 • IEEE Design and Test • 33(1):32-42 Yazdanbakhsh A, Thwaites B, Esmaeilzadeh H, Pekhimenko G, Mutlu O, Mowry TC
Journal Article Page overlays 2016 • Computer architecture news • 43(3S):79-91 Seshadri V, Pekhimenko G, Ruwase O, Mutlu O, Gibbons PB, Kozuch MA, Mowry TC, Chilimbi T
Journal Article RFVP: Rollback-Free Value Prediction with Safe-to-Approximate Loads 2016 • ACM Transactions on Architecture and Code Optimization (TACO) • 12(4): Yazdanbakhsh A, Pekhimenko G, Thwaites B, Esmaeilzadeh H, Mutlu O, Mowry TC
Conference A Case for Core-Assisted Bottleneck Acceleration in GPUs: Enabling Flexible Data Compression with Assist Warps 2015 • Proceedings / Annual International Symposium on Computer Architecture. International Symposium on Computer Architecture • 41-53 Vijaykumar N, Pekhimenko G, Jog A, Bhowmick A, Ausavarungnirun R, Das C, Kandemir M, Mowry TC, Mutlu O
Conference Exploiting Compressed Block Size as an Indicator of Future Reuse 2015 • IEEE Symposium on High-Performance Computer Architecture (HPCA) • 51-63 Pekhimenko G, Huberty T, Cai R, Mutlu O, Gibbons PB, Kozuch MA, Mowry TC
Journal Article Fast Bulk Bitwise AND and OR in DRAM 2015 • IEEE Computer Architecture Letters • 14(2):127-131 Seshadri V, Hsieh K, Boroumand A, Lee D, Kozuch MA, Mutlu O, Gibbons PB, Mowry TC
Conference Gather-Scatter DRAM: In-DRAM Address Translation to Improve the Spatial Locality of Non-unit Strided Accesses 2015 • Micro -Annual Workshop then Annual International Symposium- • 267-280 Seshadri V, Mullins T, Boroumand A, Mutlu O, Gibbons PB, Kozuch MA, Mowry TC
Conference Page Overlays: An Enhanced Virtual Memory Framework to Enable Fine-grained Memory Management 2015 • Proceedings / Annual International Symposium on Computer Architecture. International Symposium on Computer Architecture • 79-91 Seshadri V, Pekhimenko G, Ruwase O, Mutlu O, Gibbons PB, Kozuch MA, Mowry TC, Chilimbi T
Journal Article Toggle-Aware Compression for GPUs 2015 • IEEE Computer Architecture Letters • 14(2):164-168 Pekhimenko G, Bolotin E, O'Connor M, Mutlu O, Mowry TC, Keckler SW
Conference Tracking and Reducing Uncertainty in Dataflow Analysis-Based Dynamic Parallel Monitoring 2015 • Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT • 266-279 Goodstein ML, Gibbons PB, Kozuch MA, Mowry TC
Journal Article Guardrail 2014 • Computer architecture news • 42(1):655-670 Ruwase O, Kozuch MA, Gibbons PB, Mowry TC
Journal Article Guardrail 2014 • ACM Sigplan Notices • 49(4):655-670 Ruwase O, Kozuch MA, Gibbons PB, Mowry TC
Conference Guardrail: A High Fidelity Approach to Protecting Hardware Devices from Buggy Drivers 2014 • ACM Sigplan Notices • 49(4):655-669 Ruwase O, Kozuch MA, Gibbons PB, Mowry TC
Journal Article Mitigating Prefetcher-Caused Pollution Using Informed Caching Policies for Prefetched Blocks 2014 • ACM Transactions on Architecture and Code Optimization (TACO) • 11(4): Seshadri V, Yedkar S, Xin H, Mutlu O, Gibbons PB, Kozuch MA, Mowry TC
Conference Rollback-Free Value Prediction with Approximate Loads 2014 • Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT • 493-494 Thwaites B, Pekhimenko G, Esmaeilzadeh H, Yazdanbakhsh A, Mutlu O, Park J, Mururu G, Mowry TC
Conference The Dirty-Block Index 2014 • Proceedings / Annual International Symposium on Computer Architecture. International Symposium on Computer Architecture • 157-168 Seshadri V, Bhowmick A, Mutlu O, Gibbons PB, Kozuch MA, Mowry TC
Journal Article The dirty-block index 2014 • Computer architecture news • 42(3):157-168 Seshadri V, Bhowmick A, Mutlu O, Gibbons PB, Kozuch MA, Mowry TC
Conference Linearly compressed pages: A low-complexity, low-latency main memory compression framework 2013 • Micro -Santa Monica- • 172-184 Pekhimenko G, Seshadri V, Kim Y, Xin H, Mutlu O, Gibbons PB, Kozuch MA, Mowry TC
Conference RowClone: Fast and energy-efficient in-DRAM bulk data copy and initialization 2013 • Micro -Santa Monica- • 185-197 Seshadri V, Kim Y, Fallin C, Lee D, Ausavarungnirun R, Pekhimenko G, Luo Y, Mutlu O, Gibbons PB, Kozuch MA, Mowry TC
Conference Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches 2012 • Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT • 377-388 Pekhimenko G, Seshadri V, Mutlu O, Kozuch MA, Gibbons PB, Mowry TC
Conference Chrysalis Analysis: Incorporating Synchronization Arcs in Dataflow-Analysis-Based Parallel Monitoring 2012 • Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT • 201-212 Goodstein ML, Chen S, Gibbons PB, Kozuch MA, Mowry TC