Huawei's Tau (τ) Law: Rewriting Semiconductor Scaling Without Advanced Lithography

by needhelp
Huawei
Semiconductors
AI Chips
Moore's Law
Ascend
Nvidia
US-China
Deep Dive

Date: 2026-05-28 | Reading time: ~25 min

Semiconductor wafer under microscope


Executive Summary

On May 25, 2026, at IEEE ISCAS 2026 in Shanghai, He Tingbo, Huawei’s Semiconductor Business President, unveiled the Tau (τ) Scaling Law. It is the first time a Chinese company has proposed a guiding principle for the global semiconductor industry.

The same week, Huawei’s Ascend 910C (800 TFLOPS FP16, roughly 80% of Nvidia’s H100) is in mass production, powering large-scale AI deployments. The upcoming Ascend 910D targets surpassing the H100 outright.

Two things happening at once: a new theoretical framework, and chips shipping in volume. This is Huawei’s dual-track answer to US sanctions.

This article covers:

  • Mathematical foundation of the τ Law
  • LogicFolding — 3D chip architecture without advanced lithography
  • Ascend 910C/910D vs. Nvidia H100/H200 benchmarks
  • The escalating US-China chip war

1. Moore’s Law Is Out of Road

For 60 years, Moore’s Law ran the industry: transistor counts double every 18–24 months through geometric miniaturization.

That era is ending. Three walls:

1.1 Physics: Quantum Tunneling

Below 3nm, transistor gates span a few dozen silicon atoms. Electrons tunnel through insulating barriers. Result: uncontrollable leakage, excess heat, instability.

The hard floor is around 1.5nm. Conventional transistors stop working below that.

1.2 Economics: The Money Wall

| Process Node | Fab Investment | Design Cost per Chip |
|---|---|---|
| 28nm | ~$6B | ~$50M |
| 7nm | ~$15B | ~$200M |
| 3nm | ~$20B | $500M–$1B |
| 2nm | ~$28B (projected) | >$1B |

A single 3nm fab costs nearly $20 billion. A tape-out exceeds $100 million. Only TSMC and Samsung can afford the leading edge. The economic engine that made Moore’s Law self-fulfilling is seizing up.

1.3 Performance: Diminishing Returns

At advanced nodes, leakage power dominates dynamic power. Cost-per-transistor has stopped declining. Performance-per-watt gains shrink with each shrink. The industry needs a new paradigm.


2. The Tau (τ) Law: From Space to Time

2.1 Core Principle

The τ Law reframes semiconductor progress. Instead of spatial density (transistors/mm²), it optimizes temporal efficiency — signal propagation delay across the entire computing stack.

τ (tau) is the time constant in physics. Huawei proposes it as the universal optimization target for the full hierarchy.

2.2 The Math

$$\tau = f(\tau_{\text{transistor}}, \tau_{\text{circuit}}, \tau_{\text{chip}}, \tau_{\text{system}})$$

Where:

  • $\tau_{\text{transistor}}$ — Intrinsic switching delay (picoseconds)
  • $\tau_{\text{circuit}}$ — RC propagation delay across critical paths
  • $\tau_{\text{chip}}$ — Memory access and on-chip interconnect latency
  • $\tau_{\text{system}}$ — End-to-end message passing across the datacenter

This τ spans ~12 orders of magnitude in time (picoseconds to seconds).

Generational scaling:

$$\tau_{n+1} = \frac{\tau_n}{\alpha}$$

The scaling factor α is workload-dependent — not universal:

| Workload Type | α (Annual Scaling Factor) |
|---|---|
| Power-constrained mobile | ~1.3× |
| Safety-critical autonomous | ~1.5× |
| AI training and inference | ~10× |

For AI — where throughput equals revenue — the τ Law enables 10× annual improvement. Far beyond what geometry alone could deliver.
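To make the compounding concrete, here is a minimal Python sketch of the generational rule $\tau_{n+1} = \tau_n / \alpha$ using the α values from the table. The starting value of 1.0 is an arbitrary unit and the helper name `tau_after` is illustrative, not something from Huawei's paper.

```python
# Sketch of the generational rule tau_{n+1} = tau_n / alpha.
# Alpha values come from the article's table; tau0 = 1.0 is an arbitrary unit.

ALPHA_BY_WORKLOAD = {
    "Power-constrained mobile": 1.3,
    "Safety-critical autonomous": 1.5,
    "AI training and inference": 10.0,
}


def tau_after(tau0: float, alpha: float, generations: int) -> float:
    """Apply tau_{n+1} = tau_n / alpha repeatedly: tau0 / alpha**generations."""
    return tau0 / alpha ** generations


for workload, alpha in ALPHA_BY_WORKLOAD.items():
    remaining = tau_after(1.0, alpha, generations=3)
    print(f"{workload}: tau falls to {remaining:.4f} of its start after 3 generations")
```

Because the rule is a simple geometric series, the gap between workloads widens quickly: three generations at α ≈ 10 shrink τ a thousandfold, while α ≈ 1.3 only roughly halves it.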

2.3 Why τ Works as a Unified Metric

From He Tingbo’s ISCAS paper “A Time Scaling Theory for Multi-Layer Electronic Systems”:

“Frequency, latency, bandwidth, and throughput — at every level, these are governed by τ. Process technicians, circuit designers, and system architects can discuss the same quantity using the same units.”

One metric across four layers. That is the key. Previously, each discipline optimized local metrics that didn’t compose.

2.4 The Four-Layer Co-Optimization Stack

flowchart TB
    subgraph System["System Layer"]
        direction TB
        UB["UnifiedBus 灵衢总线<br/>Unified Memory Addressing<br/>Native Memory Semantics"]
        NET["Hi-ONE Optical Interconnect<br/>100–200m reach<br/>~500× latency reduction"]
    end

    subgraph Chip["Chip Layer"]
        direction TB
        SW["Software-Architecture-Silicon<br/>Full-Stack Co-Design"]
        ARCH["Workload-Driven Pipeline<br/>Fine-Grained Data Flow Control"]
    end

    subgraph Circuit["Circuit Layer"]
        direction TB
        LF["LogicFolding<br/>3D Vertical Integration"]
        RC["RC Optimization<br/>Low-κ Dielectrics"]
    end

    subgraph Device["Device Layer"]
        direction TB
        TR["Transistor Engineering<br/>GAA / Strain / High-κ Metal Gate"]
        PAR["Parasitic R & C Reduction<br/>Interconnect Optimization"]
    end

    Device --> Circuit --> Chip --> System

    style System fill:#e1f5fe
    style Chip fill:#f3e5f5
    style Circuit fill:#e8f5e9
    style Device fill:#fff3e0

| Layer | Optimization Target | Key Techniques |
|---|---|---|
| Device | Minimize τ_transistor | Mobility enhancement, strain engineering, GAA, parasitic R/C reduction |
| Circuit | Minimize RC delay | LogicFolding (3D stacking), low-κ dielectrics, shorter critical-path wiring |
| Chip | Minimize compute + memory τ | Software-architecture-silicon co-design, workload-driven pipeline |
| System | Minimize end-to-end message τ | UnifiedBus (灵衢), optical interconnects, unified memory addressing |

3. LogicFolding: 3D Without EUV

3.1 From Suburbs to Skyscrapers

LogicFolding is the crown jewel. It transforms how circuits are laid out.

Traditional 2D: all components on a flat plane. Signals travel long lateral distances. Congestion on critical paths. Power wasted shuttling data across the die.

LogicFolding: stacks planar circuits vertically. Like swapping a single-story suburb for a high-rise with express elevators. Signals travel shorter distances. Lower resistive and capacitive loads. Faster τ.

graph LR
    subgraph Traditional["Traditional 2D Layout"]
        direction LR
        A["Block A<br/>(top-left)"] ---|"Long wire<br/>High R, High C<br/>Slow τ"| B["Block B<br/>(bottom-right)"]
    end

    subgraph LogicFolding["LogicFolding 3D Layout"]
        direction TB
        A2["Block A<br/>(Layer 1)"]
        B2["Block B<br/>(Layer 2)"]
        A2 -.->|"Short via<br/>Low R, Low C<br/>Fast τ"| B2
    end

    style Traditional fill:#ffebee
    style LogicFolding fill:#e8f5e9

3.2 Kirin 2026: First Proof

Huawei demonstrated LogicFolding in the upcoming Kirin 2026 mobile processor:

| Metric | Kirin 2025 (2D) | Kirin 2026 (LogicFolding) | Improvement |
|---|---|---|---|
| Transistor Density | 155 MTr/mm² | 238 MTr/mm² | +53.5% |
| Performance Core Freq | ~2.6 GHz | 3.1 GHz | +19% |
| Energy Efficiency | Baseline | +41% | +41% |
| Process | SMIC 7nm | SMIC 7nm (same node) | |

Same fab. Same node. 53.5% density gain. That is three years of traditional geometric scaling in one step — achieved through architecture alone.

3.3 Kirin Roadmap to 2031

timeline
    title Kirin Chip Roadmap Under the τ Law
    2026 (Fall) : Kirin 2026 debuts LogicFolding : 3.10 GHz, 238 MTr/mm² : First 2-layer folding
    2027 : Kirin 2027 : 3.39 GHz, enhanced folding
    2028 : Kirin 2028 : 3.71 GHz, multi-layer folding
    2029 : Kirin 2029 : >4.00 GHz, full-scale 3D
    2031 : Target: 1.4nm-equivalent density : ~600+ MTr/mm² projected

By 2031, Huawei projects density equivalent to a 1.4nm process — achieved through architectural innovation, not lithographic shrinkage.
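As a rough check on what the roadmap implies, the sketch below derives compound annual growth rates from the quoted figures. It assumes the "~600+" density projection can be treated as exactly 600 MTr/mm²; the `cagr` helper is illustrative.

```python
# Roadmap figures quoted in the article.
# Assumption: the "~600+ MTr/mm²" projection is treated as exactly 600.
FREQ_GHZ = {2026: 3.10, 2027: 3.39, 2028: 3.71}
DENSITY_MTR_MM2 = {2026: 238.0, 2031: 600.0}


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1.0 / years) - 1.0


freq_growth = cagr(FREQ_GHZ[2026], FREQ_GHZ[2028], years=2)
density_growth = cagr(DENSITY_MTR_MM2[2026], DENSITY_MTR_MM2[2031], years=5)

# Extrapolate one more year at the same rate to compare with the 2029 target.
freq_2029 = FREQ_GHZ[2026] * (1.0 + freq_growth) ** 3

print(f"Frequency growth: {freq_growth:.1%} per year")
print(f"Density growth:   {density_growth:.1%} per year")
print(f"Extrapolated 2029 frequency: {freq_2029:.2f} GHz")
```

Under these assumptions the roadmap works out to roughly 9% per year in clock frequency and about 20% per year in density, and extrapolating the frequency trend one more year is consistent with the ">4.00 GHz" entry for 2029.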


4. Ascend 910C/910D vs. Nvidia H100

τ Law is the long game. The near-term offensive is shipping now.

4.1 Specifications

| Specification | Ascend 910C | Nvidia H100 SXM | Nvidia H20 (China) |
|---|---|---|---|
| Process Node | SMIC 7nm N+2 | TSMC 4N (5nm) | TSMC 4N (5nm) |
| Transistors | 53 billion | ~80 billion | ~80 billion |
| Architecture | Da Vinci (dual-die) | Hopper | Hopper |
| FP16/BF16 | ~752 TFLOPS | 989 TFLOPS | 296 TFLOPS |
| FP8 | 1,504 TFLOPS | 1,979 TFLOPS | 592 TFLOPS |
| INT8 | 1,504 TOPS | 3,958 TOPS | 592 TOPS |
| Memory | 128 GB HBM2e | 80 GB HBM3 | 96 GB HBM3 |
| Memory Bandwidth | 3.2 TB/s | 3.35 TB/s | 4.0 TB/s |
| TDP | ~310–500W | 700W | 400W |
| Interconnect | HCCS (392 GB/s) | NVLink 4 (900 GB/s) | NVLink 4 (900 GB/s) |
| vs. H100 | ~76–81% | 100% (baseline) | ~30% |
| Chip Logic Area | ~1.6× H100 | Baseline | Baseline |
| Domestic Content | >90% | N/A | N/A |
| Unit Price (Est.) | ~$2,500–3,000 | ~$25,000–30,000 | ~$12,000–15,000 |
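The "~76–81%" row can be reproduced from the FP16 figures. The sketch below assumes the range comes from dividing the two 910C FP16 numbers the article quotes (~752 in the table, 800 in the summary) by the H100's 989 TFLOPS, and uses the midpoints of the estimated price ranges. These are back-of-envelope ratios, not benchmark results.

```python
# Figures copied from the article's specification table and executive summary.
H100_FP16_TFLOPS = 989.0
ASCEND_910C_FP16_TFLOPS = {"spec table (~752)": 752.0, "summary (800)": 800.0}
PRICE_USD = {"Ascend 910C": (2500, 3000), "Nvidia H100 SXM": (25000, 30000)}


def fraction_of_h100(tflops: float) -> float:
    """Peak FP16 throughput relative to the H100 SXM."""
    return tflops / H100_FP16_TFLOPS


def midpoint(bounds: tuple) -> float:
    """Midpoint of a (low, high) estimate range."""
    low, high = bounds
    return (low + high) / 2


for label, tflops in ASCEND_910C_FP16_TFLOPS.items():
    print(f"910C FP16, {label}: {fraction_of_h100(tflops):.0%} of H100")

price_ratio = midpoint(PRICE_USD["Nvidia H100 SXM"]) / midpoint(PRICE_USD["Ascend 910C"])
print(f"H100 costs about {price_ratio:.0f}x as much at the midpoint estimates")
```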

4.2 Where 910C Wins, Where It Lags

Wins:

  • 128 GB memory vs. H100’s 80 GB — matters for large model inference
  • Cost: roughly 10× cheaper
  • Software-hardware co-optimization: CANN framework + CloudMatrix super nodes push inference efficiency above raw specs

Lags:

  • Architecture efficiency: logic die area ~60% larger than H100 for similar performance
  • Memory bandwidth: slightly behind (3.2 vs. 3.35 TB/s) — bottleneck for training
  • Ecosystem: CANN/CUNN vs. CUDA — significant gap in tooling and libraries
  • Training workloads: less optimized for sustained training

4.3 CloudMatrix 384: Super Node

graph TB
    subgraph CM["CloudMatrix 384 Super Node"]
        direction TB
        subgraph NPUs["Compute Layer (384× Ascend 910C)"]
            NPU1["NPU 1"]
            NPU2["NPU 2"]
            NPU3["..."]
            NPU4["NPU 384"]
        end

        subgraph Network["Three-Plane Network Architecture"]
            UB["UB Plane<br/>Scale-Up All-to-All<br/>392 GB/s per NPU"]
            RDMA["RDMA Plane<br/>Scale-Out RoCE<br/>200 Gbps per NPU"]
            VPC["VPC Plane<br/>Management & Storage"]
        end

        subgraph CPU["Kunpeng CPU Layer"]
            CPU1["Kunpeng 920"]
        end
    end

    NPUs --> UB
    NPUs --> RDMA
    NPUs --> VPC
    CPU1 --> UB

    style CM fill:#e3f2fd
    style Network fill:#f1f8e9

CloudMatrix 384 — 384 Ascend 910C NPUs — delivers:

  • Prefill throughput: 6,688 tokens/s per NPU
  • Decode throughput: 1,943 tokens/s per NPU (<50ms TPOT)
  • Compute efficiency: 4.45 tok/s/TFLOPS prefill, 1.29 tok/s/TFLOPS decode

Both efficiency figures exceed those reported for optimized H100 deployments (3.75 prefill and 1.10 decode, in the same tok/s/TFLOPS units). Full-stack co-optimization at work.
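The efficiency metric is per-NPU throughput divided by peak compute. The sketch below assumes the normalizer is the 910C's 1,504 TFLOPS FP8/INT8 peak from the specification table; the article does not state this explicitly, but that choice reproduces the quoted 4.45 and 1.29.

```python
# Assumption: efficiency is normalized by the 910C's 1,504 TFLOPS peak
# (the FP8/INT8 figure in the spec table); the article does not say so directly.
PEAK_TFLOPS_PER_NPU = 1504.0
THROUGHPUT_TOK_S = {"prefill": 6688.0, "decode": 1943.0}   # tokens/s per NPU
H100_EFFICIENCY = {"prefill": 3.75, "decode": 1.10}        # tok/s/TFLOPS, as quoted


def compute_efficiency(tokens_per_s: float, peak_tflops: float) -> float:
    """Tokens per second delivered per TFLOPS of peak compute."""
    return tokens_per_s / peak_tflops


efficiency = {
    phase: compute_efficiency(rate, PEAK_TFLOPS_PER_NPU)
    for phase, rate in THROUGHPUT_TOK_S.items()
}
gain_over_h100 = {
    phase: efficiency[phase] / H100_EFFICIENCY[phase] - 1.0 for phase in efficiency
}

for phase in efficiency:
    print(f"{phase}: {efficiency[phase]:.2f} tok/s/TFLOPS "
          f"({gain_over_h100[phase]:+.0%} vs. quoted H100 figure)")
```

Note that this metric rewards software and interconnect efficiency per unit of peak compute; it says nothing about absolute throughput per chip.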

4.4 Ascend 910D: Going for the Lead

| Specification | Ascend 910D (Projected) | Nvidia H100 | Nvidia B200 |
|---|---|---|---|
| Process | SMIC 7nm N+2 (enhanced) | TSMC 5nm | TSMC 4nm |
| FP16 | 1,000+ TFLOPS | 989 TFLOPS | ~2,250 TFLOPS |
| Memory | 192 GB HBM3 | 80 GB HBM3 | 192 GB HBM3e |
| TDP | ~350–450W | 700W | 1,000W |
| Target | Surpass H100 | Baseline | Next-gen |

910D in sampling with ByteDance, Baidu, Alibaba, and China Mobile. Mass production expected late 2025.

AI server racks in data center


5. The Geopolitical Layer: Sanctions vs. Resilience

5.1 Escalation Timeline

timeline
    title US-China Chip Sanctions Timeline
    2019 : Huawei added to Entity List : TSMC cut-off begins
    2020 : SMIC added to Entity List : EUV equipment blocked
    2022 : CHIPS Act passed : October 7 export controls
    2023 : Japan/Netherlands join restrictions : More equipment blocked
    2024 : H20/A800 China-custom chips banned : Nvidia loses $5.5B
    2025 Jan : Biden AI Diffusion Rule (revoked May)
    2025 May 13 : BIS warns against using Ascend chips "anywhere" : Threatens criminal penalties

On May 13, 2025, BIS issued unprecedented guidance:

“The use of Huawei’s Ascend processors (910B, 910C, 910D) anywhere in the world without a license constitutes a violation of US export controls.”

Extraterritorial jurisdiction over any use of Huawei AI chips globally.

5.2 Huawei’s Sanctions-Proof Supply Chain

| Component | Domestic Supplier | Status |
|---|---|---|
| Chip Design | Huawei HiSilicon | 100% |
| Foundry (7nm) | SMIC | Active production |
| Advanced Packaging | JCET / Tongfu Micro | >80% |
| HBM Memory | CXMT / YMTC (HBM2e) | In development |
| EDA Tools | Huawei + domestic EDA | ~40% |
| Photoresist | JSR China / domestic | Maturing |
| AI Framework | CANN / MindSpore | Functional CUDA alternative |

Key numbers:

  • 90%+ chip localization for Ascend 910C
  • 381 chips designed under τ principles over 6 years
  • SMIC 7nm N+2 yields: ~20% (2024) → 40–50% (2025)
  • Monthly production: ~2.6K wafers for Ascend

5.3 Stakeholder Map

graph TB
    subgraph US["United States"]
        BIS["BIS / Commerce Dept"]
        Nvidia["Nvidia"]
        AMD["AMD"]
        Intel["Intel"]
    end

    subgraph China["China"]
        Huawei["Huawei / HiSilicon"]
        SMIC["SMIC"]
        CXMT["CXMT / YMTC"]
        DeepSeek["DeepSeek / ByteDance / Baidu"]
    end

    subgraph Allies["US Allies"]
        TSMC["TSMC (Taiwan)"]
        ASML["ASML (Netherlands)"]
        Samsung["Samsung (Korea)"]
        Tokyo["Tokyo Electron (Japan)"]
    end

    BIS -->|"Export Controls"| Huawei
    BIS -->|"Equipment Bans"| SMIC
    Nvidia -->|"H100/H200/B200"| TSMC
    Huawei -->|"Chip Orders"| SMIC
    SMIC -->|"7nm Production"| Huawei
    DeepSeek -->|"AI Inference Demand"| Huawei
    ASML -->|"EUV Equipment"| TSMC
    ASML -.->|"Blocked"| SMIC
    TSMC -.->|"Cut Off"| Huawei

    style Huawei fill:#ffebee
    style SMIC fill:#fff3e0
    style BIS fill:#e3f2fd

6. UnifiedBus (灵衢): One Protocol for the Datacenter

A critical but under-discussed piece of the τ Law: UnifiedBus.

6.1 The Tower of Babel Problem

Current datacenter interconnects are a patchwork:

  • PCIe for chip-to-chip
  • NVLink/CXL for GPU memory pooling
  • InfiniBand/RoCE for server-to-server
  • Ethernet for management

Each translation adds 500–1000× overhead over raw wire delay.

6.2 One Stack

UnifiedBus replaces the patchwork with a single protocol spanning on-chip buses to inter-rack optical links:

| Feature | Traditional | UnifiedBus |
|---|---|---|
| Protocol Stack | Multiple (PCIe + NVLink + IB + Eth) | Single unified stack |
| Memory Model | DMA-based, driver-mediated | Native memory semantics |
| Latency (rack-to-rack) | ~10–50 μs | ~1–5 μs |
| Physical Reach | Copper: ~2m | Optical: 100–200m |
| Resource Model | Fixed allocation | Full poolization |
| Failover | Seconds | Sub-second |

graph LR
    subgraph Traditional["Traditional Multi-Protocol Stack"]
        direction TB
        APP1["Application"]
        DRV1["Drivers"]
        PCIe["PCIe Layer"]
        NVLink["NVLink Layer"]
        IB["InfiniBand"]
        ETH["Ethernet"]
        APP1 --> DRV1 --> PCIe
        DRV1 --> NVLink
        DRV1 --> IB
        DRV1 --> ETH
    end

    subgraph UB["UnifiedBus Single Stack"]
        direction TB
        APP2["Application"]
        UBL["UnifiedBus Layer"]
        PHY["Universal Physical Layer<br/>(Copper + Optical)"]
        APP2 --> UBL --> PHY
    end

    style Traditional fill:#ffebee
    style UB fill:#e8f5e9

300+ Atlas 900 super nodes shipped on UnifiedBus 1.0 since March 2025. UnifiedBus 2.0 specification is open-sourced.


7. Market Impact

7.1 Stock Moves (May 26, 2026)

| Company | Change |
|---|---|
| SMIC | +17–19% |
| Hua Hong Semiconductor | +20% |
| JCET | +12% |
| Naura Technology | +15% |
| Nvidia | -2.3% |

7.2 What Analysts Are Saying

Futurum Group (optimistic):

“The Tau Scaling Law and LogicFolding mark China’s most ambitious attempt yet to redefine semiconductor progress on its own terms.”

Omdia / The Register (skeptical):

“Huawei’s claims are more branding than breakthrough. LogicFolding is a design innovation, but making chips that perform at a certain level and actually building millions at acceptable yield are different problems.”

虎嗅 / Huxiu (balanced):

“The Tau Law didn’t come out of nowhere. From Nvidia to TSMC, from AMD to SK Hynix, the entire industry has been exploring this direction for a decade. Huawei’s contribution is formalizing this exploration into a clear framework, the first such systematic principle from a Chinese company.”

7.3 Competitive Landscape

quadrantChart
    title AI Chip Competitive Landscape (2026)
    x-axis Low Ecosystem Maturity --> High Ecosystem Maturity
    y-axis Low Raw Performance --> High Raw Performance
    quadrant-1 Market Leaders
    quadrant-2 Performance Specialists
    quadrant-3 Emerging Challengers
    quadrant-4 Niche Players
    "Nvidia H100/B200": [0.95, 0.95]
    "Nvidia H20": [0.90, 0.30]
    "Huawei Ascend 910C": [0.35, 0.75]
    "Huawei Ascend 910D": [0.40, 0.90]
    "AMD MI300X": [0.70, 0.85]
    "Intel Gaudi 3": [0.60, 0.70]
    "Google TPU v5": [0.55, 0.80]
    "Amazon Trainium2": [0.50, 0.65]

8. The DeepSeek Connection

DeepSeek — the Chinese AI lab whose R1 and V3 models disrupted global LLM economics — runs significant inference capacity on Huawei’s CloudMatrix.

8.1 Inference Economics

| Metric | DeepSeek on Ascend 910C | DeepSeek on Nvidia H800 |
|---|---|---|
| Inference cost (V3) | ~1 CNY / 1M tokens | ~7 CNY / 1M tokens |
| Inference cost (R1) | ~4 CNY / 1M tokens | ~20+ CNY / 1M tokens |
| Prefill efficiency | 4.45 tok/s/TFLOPS | 3.96 tok/s/TFLOPS |
| Decode efficiency | 1.29 tok/s/TFLOPS | 1.17 tok/s/TFLOPS |

By these figures, inference on Ascend is roughly 5–7× cheaper per token. When software is co-optimized for hardware (CANN, CUNN kernels, custom operators), the effective gap narrows dramatically.
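Taking the table's per-token prices at face value, the cost ratios can be checked directly. The sketch treats the "20+" R1 figure as exactly 20, so the R1 ratio is a lower bound; the dictionary keys are illustrative names.

```python
# Per-token inference prices from the table, in CNY per 1M tokens.
# Assumption: the "~20+" H800 figure for R1 is treated as 20, a lower bound.
COST_CNY_PER_M_TOKENS = {
    "V3": {"ascend_910c": 1.0, "h800": 7.0},
    "R1": {"ascend_910c": 4.0, "h800": 20.0},
}


def cost_advantage(ascend_910c: float, h800: float) -> float:
    """How many times cheaper the Ascend price is than the H800 price."""
    return h800 / ascend_910c


advantage = {
    model: cost_advantage(**prices) for model, prices in COST_CNY_PER_M_TOKENS.items()
}

for model, ratio in advantage.items():
    print(f"DeepSeek {model}: about {ratio:.0f}x cheaper on Ascend 910C")
```

On these numbers the advantage is about 7× for V3 and at least 5× for R1, which is large but smaller than the gap in hardware unit price.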

8.2 Full-Stack Synergy

flowchart LR
    subgraph HW["Huawei Hardware Stack"]
        A["Ascend 910C/910D<br/>NPU"]
        B["CloudMatrix 384<br/>Super Node"]
        C["UnifiedBus<br/>Interconnect"]
    end

    subgraph SW["Software Stack"]
        D["CANN / CUNN<br/>CUDA Alternative"]
        E["MindSpore / PyTorch<br/>Framework"]
        F["DeepSeek R1/V3<br/>Optimized Models"]
    end

    subgraph Market["Market Impact"]
        G["1 CNY / 1M tokens<br/>V3 Inference"]
        H["90% Cost Reduction<br/>vs. Nvidia Cloud"]
        I["20,000+ Developers<br/>in Ecosystem"]
    end

    A --> B --> C
    D --> E --> F
    HW --> SW --> Market

    style HW fill:#e3f2fd
    style SW fill:#e8f5e9
    style Market fill:#fff3e0

9. Critical Assessment: What’s Real, What’s Projection

| Claim | Evidence Status | Assessment |
|---|---|---|
| τ Law framework | Published at IEEE ISCAS | Peer-reviewed; solid foundation |
| 381 chips mass-produced | Huawei disclosure | Plausible; multiple product lines |
| LogicFolding 53.5% density gain | Kirin 2026 data | Unverified; fall 2026 launch will validate |
| 1.4nm-equivalent by 2031 | Projection | Ambitious; depends on multi-layer folding |
| Ascend 910C at 80% of H100 | Independent estimates | Analyst consensus; validated by DeepSeek |
| CloudMatrix efficiency > H100 | Published benchmarks | Competitive for MoE inference; training gap remains |

Key Risks

  1. Manufacturing: SMIC 7nm yields (40–50%) far below TSMC (>80%). Without EUV, pushing below 7nm is brutal economics.

  2. Memory bottleneck: HBM3/HBM3e near-impossible to source under sanctions. CXMT domestic HBM still early-stage.

  3. Ecosystem gap: CANN/CUNN is functional. Not CUDA. The “one-line import” migration promise is optimistic for complex models.

  4. Die area: Ascend 910C chip area ~60% larger than H100. Architecture is less efficient per transistor.

  5. Market access: US sanctions limit Ascend to China + friendly markets (Middle East, Russia, parts of SE Asia).


10. Where This Goes: Five Scenarios to 2030

  1. Convergence: Huawei catches up through domestic EUV or sanctions easing. Gap closes to <1 generation.

  2. Sustained Bifurcation: Two parallel ecosystems. China dominates domestic + Belt & Road. West holds premium global market.

  3. Western Pull-Ahead: TSMC hits 1nm with GAA/CFET. Architecture can’t compensate. Huawei falls 3+ generations behind.

  4. Paradigm Shift: τ Law principles gain industry-wide adoption. Architectural innovation becomes primary lever. Process node matters less.

  5. Full Decoupling: Complete split. China achieves self-sufficiency at cost of 5–10 year delay. Global innovation slows.


11. A Rule-Maker, Not a Follower

The τ Law is more than a technical paper:

  • Scientific contribution: peer-reviewed framework for post-Moore optimization
  • Engineering strategy: 381 commercial chips already produced under its principles
  • Geopolitical signal: US sanctions catalyzed rather than crippled Chinese semiconductor innovation
  • Industry invitation: UnifiedBus 2.0 is open-sourced

The Ascend 910C — ~80% of H100 performance at ~10% of the cost — proves architectural ingenuity can compensate for process node disadvantage. The 910D aims to close the gap entirely.

Answers we get over the next five years will determine whether the τ Law rivals Moore’s Law in historical significance:

  • Can SMIC hit 70%+ yields at 7nm and push into 5nm?
  • Will Kirin 2026 deliver on LogicFolding this fall?
  • Can CANN close the ecosystem gap with CUDA?
  • Will the 1.4nm-equivalent target for 2031 be achieved?

One thing is already clear: Huawei has shifted from 追赶者 (follower) to 规则制定者 (rule-maker).

As He Tingbo said at ISCAS 2026:

“We believe that openness and collaboration are key to driving ongoing progress in the semiconductor industry. No single company can independently find all the answers along the path of semiconductor evolution.”

The τ Law is Huawei’s answer. The rest of the industry now decides whether to engage with the question.


Appendix A: Key Formulas

Time Constant Decomposition

$$\tau_{\text{total}} = \sqrt{\tau_{\text{transistor}}^2 + \tau_{\text{circuit}}^2 + \tau_{\text{chip}}^2 + \tau_{\text{system}}^2}$$
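One consequence of a root-sum-square combination is that the slowest layer dominates the total. The sketch below illustrates this with hypothetical per-layer delays chosen only to span several orders of magnitude; none of these values come from Huawei.

```python
import math

# Hypothetical per-layer delays in seconds, for illustration only.
HYPOTHETICAL_TAU_S = {
    "transistor": 1e-12,
    "circuit": 1e-10,
    "chip": 1e-8,
    "system": 1e-6,
}


def tau_total(layers: dict) -> float:
    """Root-sum-square combination of the per-layer time constants."""
    return math.sqrt(sum(t ** 2 for t in layers.values()))


def with_layer_halved(layers: dict, name: str) -> dict:
    """Copy of the layer delays with one layer's delay cut in half."""
    return {**layers, name: layers[name] / 2}


baseline = tau_total(HYPOTHETICAL_TAU_S)
for name in HYPOTHETICAL_TAU_S:
    ratio = tau_total(with_layer_halved(HYPOTHETICAL_TAU_S, name)) / baseline
    print(f"Halving tau_{name}: total becomes {ratio:.6f} of baseline")
```

With delays this far apart, halving the system-level term roughly halves the total, while halving the transistor term changes almost nothing, which is consistent with the article's emphasis on optimizing across all four layers.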

Circuit-level τ:

$$\tau_{\text{circuit}} = R_{\text{wire}} \cdot C_{\text{total}} = \frac{\rho \cdot L}{A} \cdot \left(\epsilon_{\text{ox}} \cdot \frac{A}{t_{\text{ox}}} + C_{\text{parasitic}}\right)$$

LogicFolding reduces $L$ (wire length) by 50–90%, directly decreasing $\tau_{\text{circuit}}$.
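A small numerical sketch of the circuit-level formula shows how the delay tracks wire length. All parameter values are hypothetical, picked only to give plausible magnitudes, and the code splits the formula's $A$ into a wire cross-section (`wire_area`) and a capacitor plate area (`plate_area`), which is an interpretation rather than something the article states.

```python
def tau_circuit(rho, length, wire_area, eps_ox, plate_area, t_ox, c_parasitic):
    """RC delay per the appendix formula: R_wire * C_total, in seconds."""
    r_wire = rho * length / wire_area                    # wire resistance, ohms
    c_total = eps_ox * plate_area / t_ox + c_parasitic   # load capacitance, farads
    return r_wire * c_total


# Hypothetical parameters in SI units, for illustration only.
BASE = dict(
    rho=1.7e-8,          # resistivity, ohm*m (roughly bulk copper)
    length=1e-3,         # wire length, m
    wire_area=1e-14,     # wire cross-section, m^2
    eps_ox=3.45e-11,     # oxide permittivity, F/m
    plate_area=1e-12,    # capacitor plate area, m^2
    t_ox=1e-9,           # oxide thickness, m
    c_parasitic=1e-15,   # parasitic capacitance, F
)


def folded_tau(cut: float) -> float:
    """Delay after shortening the wire by the given fraction (0.5 = 50%)."""
    return tau_circuit(**{**BASE, "length": BASE["length"] * (1 - cut)})


baseline = tau_circuit(**BASE)
for cut in (0.5, 0.9):
    print(f"L cut by {cut:.0%}: {baseline * 1e12:.1f} ps -> {folded_tau(cut) * 1e12:.1f} ps")
```

In the formula as written, capacitance does not depend on $L$, so the delay falls in direct proportion to wire length. In real interconnect, wire capacitance also grows with length, so the benefit of shorter wires would typically be larger than this linear estimate.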

Transistor Density Equivalence

$$\rho_{\text{effective}} = \rho_{\text{physical}} \times \left(1 + \sum_{i=1}^{n} f_i \cdot \eta_i\right)$$

For Kirin 2026 (2-layer folding, treated here as a single folded-layer term with $f=0.55$, $\eta=0.95$):

$$\rho_{\text{effective}} = 155 \times (1 + 0.55 \times 0.95) \approx 236 \text{ MTr/mm}^2$$

This lands within about 1% of the reported 238 MTr/mm².
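The density formula is easy to evaluate directly. The sketch below assumes each folded layer contributes one $(f_i, \eta_i)$ pair and that Kirin 2026 has a single such term; it also solves for the $f \cdot \eta$ product that would be needed to hit the reported 238 MTr/mm² exactly.

```python
def effective_density(physical: float, folded_layers: list) -> float:
    """rho_physical * (1 + sum of f_i * eta_i) over the folded layers."""
    return physical * (1.0 + sum(f * eta for f, eta in folded_layers))


def required_f_eta(physical: float, target: float) -> float:
    """The summed f*eta product needed to reach a target effective density."""
    return target / physical - 1.0


# Assumption: one folded-layer term with the article's f = 0.55, eta = 0.95.
kirin_2026 = effective_density(155.0, [(0.55, 0.95)])
needed = required_f_eta(155.0, 238.0)

print(f"Formula result: {kirin_2026:.1f} MTr/mm^2")
print(f"f*eta needed for 238 MTr/mm^2: {needed:.4f}")
```

With the stated parameters the formula gives just under 236 MTr/mm², slightly below the reported figure; reaching 238 exactly would need an $f \cdot \eta$ product of about 0.535 rather than 0.5225.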

AI Training Efficiency

$$T_{\text{training}} \propto \frac{N_{\text{params}} \cdot D_{\text{tokens}}}{P_{\text{compute}} \cdot \eta_{\text{utilization}}}$$

Huawei targets $\eta_{\text{utilization}}$ — achieving >90% on CloudMatrix for MoE vs. industry average 40–60%.
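Since training time is inversely proportional to utilization when model size, data, and peak compute are held fixed, the effect of the claimed utilization gain is a one-line ratio. The sketch below uses the article's 40–60% and 90% figures; the function name is illustrative.

```python
def relative_training_time(eta_old: float, eta_new: float) -> float:
    """T_new / T_old when only utilization changes (N, D, P held fixed)."""
    return eta_old / eta_new


TARGET_UTILIZATION = 0.90
for industry_eta in (0.40, 0.60):
    ratio = relative_training_time(industry_eta, TARGET_UTILIZATION)
    print(f"Utilization {industry_eta:.0%} -> {TARGET_UTILIZATION:.0%}: "
          f"training time falls to {ratio:.0%} of the original")
```

If the >90% figure holds, the same hardware would finish a given training run in roughly 44–67% of the time, which is equivalent to a 1.5–2.25× increase in effective compute.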


Appendix B: Glossary

| Term | Definition |
|---|---|
| τ (tau) | Time constant: characteristic time for signal propagation through an electronic system |
| LogicFolding | 3D chip architecture stacking circuit layers vertically to shorten signal paths |
| UnifiedBus (灵衢) | Unified datacenter interconnect protocol replacing PCIe/NVLink/InfiniBand |
| CANN | Compute Architecture for Neural Networks: Huawei’s AI software stack |
| CUNN | CUDA-to-CANN migration layer for PyTorch models on Ascend |
| CloudMatrix | Huawei’s AI supercomputer architecture using Ascend NPUs |
| SMIC N+2 | SMIC’s 7nm-class process using DUV lithography |
| HBM | High Bandwidth Memory: 3D-stacked DRAM for AI accelerators |
| MoE | Mixture of Experts: neural network architecture using conditional computation |
| EUV | Extreme Ultraviolet lithography: most advanced chip patterning technology |

References

  1. He Tingbo, “A Time Scaling Theory for Multi-Layer Electronic Systems,” IEEE ISCAS 2026, Shanghai.
  2. Huawei Official Newsroom, “Huawei Announces Tau (τ) Scaling Law,” May 25, 2026.
  3. Xinhua News Agency, “Huawei Unveils New Chip Design Approach,” May 26, 2026.
  4. DeepSeek / Huawei Cloud, “Serving Large Language Models on Huawei CloudMatrix384,” 2025.
  5. Morgan Stanley Research, “SMIC Advanced Node Yield Analysis,” September 2025.
  6. US Bureau of Industry and Security, “Export Control Guidance on PRC Advanced Computing ICs,” May 13, 2025.
  7. Hot Chips 31, “Huawei Da Vinci Architecture Deep Dive,” 2019.
  8. Wall Street Journal, “Huawei Tests Ascend 910D as Nvidia Alternative,” April 2025.
  9. 21st Century Business Herald, “Huawei Tau Law Analysis,” May 25, 2026.
  10. Futurum Group Research, “Does Huawei’s Tau Scaling Law Challenge Logic Leadership?” May 26, 2026.

Compiled from IEEE publications, Huawei official disclosures, Xinhua reports, financial analyst research, and technical documentation. Performance figures are best-available estimates; actual results vary by deployment.

Last updated: May 28, 2026