Cache memory - Computer Circle Column

Composition and structure

Cache memory (high-speed buffer memory) is the first level of memory between main memory and the CPU. It is built from static RAM (SRAM) chips. Its capacity is relatively small, but it is faster than main memory.

It consists mainly of three parts:

Cache storage body: stores the instructions and data blocks transferred from main memory.

Address translation component: builds a directory table to translate main memory addresses into cache addresses.

Replacement component: when the cache is full, replaces data blocks according to a given strategy and updates the address translation component.

Working principle

A cache memory usually consists of a high-speed memory, an associative memory, replacement logic, and the corresponding control circuitry. In a computer system with a cache, the address that the central processor uses to access main memory is divided into three fields: row number, column number, and address within the group. Main memory is thus logically divided into several rows, each row is divided into several groups of memory cells, and each group contains from a few to a few dozen words. The high-speed memory is divided into rows and columns of memory cell groups in the same way. Both have the same number of columns and the same group size, but the high-speed memory has far fewer rows than main memory.
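The three-field address split described above can be sketched in a few lines. This is only an illustration: the field widths and the names `AddressFields` and `split_address` are assumptions, not something the text specifies.

```python
from typing import NamedTuple


class AddressFields(NamedTuple):
    row: int     # row number (the part later compared in associative memory)
    column: int  # column number (selects a column of groups)
    offset: int  # address of the word inside the group


def split_address(addr: int, column_bits: int, offset_bits: int) -> AddressFields:
    """Split a main memory address into row, column, and in-group fields."""
    offset = addr & ((1 << offset_bits) - 1)
    column = (addr >> offset_bits) & ((1 << column_bits) - 1)
    row = addr >> (offset_bits + column_bits)
    return AddressFields(row, column, offset)


# Hypothetical layout: 4 column bits and 4 in-group bits.
fields = split_address(0b1011_0110_0101, column_bits=4, offset_bits=4)
print(fields)
```

With this layout, the low bits select the word inside a group, the middle bits select the column, and whatever remains is the row number.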

The associative memory is used for associative address lookup.

When the central processing unit accesses main memory, the hardware first automatically decodes the column number field of the access address and compares all the row numbers held for that column in the associative memory with the row number field of the main memory address. If one of them matches, the main memory unit being accessed is already in the high-speed memory; this is called a hit. The hardware maps the main memory address to a high-speed memory address and performs the access. If none matches, the unit is not in the high-speed memory; this is called a miss. The hardware then accesses main memory and automatically transfers the main memory group containing the unit into an empty group in the same column of the high-speed memory. At the same time, the row number of that group in main memory is stored in the corresponding location of the associative memory.

When a miss occurs and there is no empty position in the corresponding column of the high-speed memory, one of the groups in that column is evicted to make room for the newly transferred group; this is called replacement. The rules that decide which group to replace are called replacement algorithms. Commonly used replacement algorithms include least recently used (LRU), first-in first-out (FIFO), and random (RAND). The replacement logic circuit performs this function. In addition, when a write to main memory is performed, hits and misses must be handled separately to keep the contents of main memory and the high-speed memory consistent.

Storage hierarchy

The main-auxiliary storage hierarchy exists because a computer's main memory capacity is always too small relative to the capacity programmers need. At first, the transfer of programs and data from auxiliary storage into main memory was arranged by the programmer, who had to spend a great deal of time and effort dividing a large program into blocks in advance and deciding where each block was located in auxiliary storage, at which address it was loaded into main memory, and how and when each block was swapped in and out while the program ran. This is the problem of memory space allocation. The emergence and development of operating systems freed programmers from address positioning between main and auxiliary memory as far as possible, and at the same time gave rise to the "auxiliary hardware" that supports these functions. Through a combination of software and hardware, main memory and auxiliary memory are unified into a whole, as shown in the figure. Main storage and auxiliary storage then form a storage hierarchy, that is, a storage system. Taken as a whole, its speed is close to that of main storage, its capacity is close to that of auxiliary storage, and its average price per bit is close to that of cheap, slow auxiliary storage.

Continuous development and improvement of this system gradually produced the virtual storage system that is now widely used. In such a system, the application programmer can address the entire program uniformly with the address code of machine instructions, as if the programmer had all of the virtual memory space corresponding to the width of the address code. This space can be much larger than the actual main memory, so the entire program fits in it. This instruction address code is called the virtual address (virtual memory address) or logical address, and the corresponding storage capacity is called the virtual memory capacity or virtual memory space. The address of actual main memory is called the physical address or real (storage) address, and its corresponding storage capacity is called the main storage capacity, real storage capacity, or real (main) storage space.

Main-auxiliary storage hierarchy

Cache-main storage hierarchy

When a virtual address is used to access main memory, the machine automatically translates it into a real main memory address through the auxiliary software and hardware, and checks whether the contents of the unit at that address have already been loaded into main memory. If they are in main memory, the access proceeds. If not, the auxiliary software and hardware transfer the program and data containing the unit from auxiliary memory into main memory, and then the access proceeds. These operations do not have to be arranged by the programmer; that is, they are transparent to the application programmer. The main-auxiliary storage level resolves the conflict between the need for large memory capacity and the need for low cost.

In terms of speed, there has long been a gap of about an order of magnitude between a computer's main memory and its CPU. This gap clearly limits how much of the CPU's speed can be exploited. A single memory built with a single technology cannot bridge the gap, so the problem has to be addressed through the structure and organization of the computer system. Setting up a high-speed buffer memory (cache) is an important way to solve the access speed problem. A cache memory is placed between the CPU and main memory to form a cache-main memory level, and the cache is required to keep up with the CPU in speed. Address mapping and scheduling between the cache and main memory borrow the techniques of the main-auxiliary storage level, which appeared earlier. The difference is that, because of the high speed required, they are implemented entirely in hardware rather than by a combination of software and hardware, as shown in the figure.
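Returning to the virtual address access described in this section, the "translate, check whether loaded, load if missing" sequence can be sketched as follows. The text does not say how the translation is organized, so this sketch assumes fixed-size pages and a page table kept in a dictionary; `PAGE_SIZE`, `VirtualMemory`, and `translate` are hypothetical names, and page replacement is left out.

```python
PAGE_SIZE = 4096  # assumed page size, not taken from the text


class VirtualMemory:
    def __init__(self, num_frames: int):
        self.page_table = {}                       # virtual page -> physical frame
        self.free_frames = list(range(num_frames))
        self.loads_from_auxiliary = 0

    def translate(self, virtual_addr: int) -> int:
        """Return the real address, loading the page first if it is absent."""
        page, offset = divmod(virtual_addr, PAGE_SIZE)
        if page not in self.page_table:
            # Not in main memory: bring it in from auxiliary storage.
            if not self.free_frames:
                raise MemoryError("no free frame (replacement not modeled here)")
            self.page_table[page] = self.free_frames.pop(0)
            self.loads_from_auxiliary += 1
        return self.page_table[page] * PAGE_SIZE + offset


vm = VirtualMemory(num_frames=2)
print(vm.translate(5 * PAGE_SIZE + 12))  # first touch loads the page
print(vm.translate(5 * PAGE_SIZE + 40))  # same page, no new load
```

The caller only ever supplies virtual addresses, which is the sense in which the mechanism is transparent to the application programmer.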

Address mapping and translation

Address mapping refers to the correspondence between the address of a piece of data in main memory and its address in the cache. Three address mapping methods are described below.

1. Fully associative mapping

Address mapping rule: any block of main memory can be mapped to any block of the cache.

(1) Main memory and the cache are divided into data blocks of the same size.

(2) A given data block of main memory can be loaded into any block slot of the cache.

The directory table is stored in associative memory and has three parts: the block address of the data block in main memory, the block address after it has been stored in the cache, and the valid bit (also called the load bit).

Advantages: the hit rate is relatively high and the cache's storage space is well utilized.

Disadvantages: every access to the associative memory has to compare against all of its contents, so it is slow and costly, and is therefore rarely used.
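As a rough illustration of the fully associative directory table, the sketch below keeps one entry per cache block with a valid bit and the main memory block address, and searches every entry on each lookup. The class and method names are hypothetical, and a real associative memory compares all entries in parallel rather than in a loop.

```python
class FullyAssociativeCache:
    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        # Directory table: the list index is the cache block address.
        self.directory = [{"valid": False, "mem_block": None}
                          for _ in range(num_blocks)]

    def lookup(self, addr: int):
        """Return the cache address for addr, or None on a miss."""
        mem_block, offset = divmod(addr, self.block_size)
        for cache_block, entry in enumerate(self.directory):
            if entry["valid"] and entry["mem_block"] == mem_block:
                return cache_block * self.block_size + offset
        return None

    def load(self, addr: int, cache_block: int) -> None:
        """Place the main memory block containing addr in any cache block."""
        self.directory[cache_block] = {
            "valid": True,
            "mem_block": addr // self.block_size,
        }


cache = FullyAssociativeCache(num_blocks=4, block_size=16)
cache.load(100, cache_block=2)  # any slot is allowed
print(cache.lookup(100))
```

The loop over every entry is the software counterpart of the full comparison that makes this scheme slow and costly in hardware.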

2. Direct mapping

Address mapping rule: a block in main memory can be mapped only to one specific block in the cache.

(1) Main memory and the cache are divided into data blocks of the same size.

(2) The capacity of main memory must be an integer multiple of the cache capacity.

(3) When a block from some area of main memory is stored in the cache, it can be stored only at the cache location with the same block number.

Data blocks with the same block number in each area of main memory can be transferred to the cache address with that same block number, but only one of them can be held in the cache at a time. Because the main memory and cache block numbers are the same, only the area number of the transferred block needs to be recorded in the directory. The block number and in-block address fields of the main memory address and the cache address are identical. The directory table is stored in a high-speed, small-capacity memory and has two parts: the area number of the data block in main memory and the valid bit. The number of entries in the directory table equals the number of blocks in the cache.

Advantages: the address mapping is simple. When accessing data, only the area number needs to be checked for a match, so access is fast and the hardware is simple.

Disadvantages: replacements are frequent and the hit rate is relatively low.
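A minimal sketch of direct mapping, assuming the area number and block number are obtained by integer division of the main memory block number by the number of cache blocks. The names are hypothetical, and `None` stands in for a cleared valid bit.

```python
class DirectMappedCache:
    def __init__(self, num_blocks: int, block_size: int):
        self.num_blocks = num_blocks
        self.block_size = block_size
        # Directory table: one area number per cache block (None = not valid).
        self.directory = [None] * num_blocks

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, load the block and return False."""
        mem_block = addr // self.block_size
        area, block = divmod(mem_block, self.num_blocks)
        if self.directory[block] == area:
            return True
        self.directory[block] = area  # replaces whatever was there
        return False


cache = DirectMappedCache(num_blocks=4, block_size=16)
# Addresses 0 and 64 share block number 0 in different areas, so they
# keep evicting each other even though the rest of the cache is empty.
print([cache.access(a) for a in (0, 64, 0, 64)])
```

Only one comparison is needed per access, which is why the hardware is simple, and the alternating pattern shows where the frequent replacements come from.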

3. Set-associative mapping

Set-associative mapping rules:

(1) Main memory and the cache are divided into blocks of the same size.

(2) Main memory and the cache are divided into groups (sets) of the same size.

(3) The capacity of main memory is an integer multiple of the cache capacity. Main memory is divided into areas, each the size of the cache.

(4) When data from main memory is loaded into the cache, the group numbers in main memory and in the cache must be equal. That is, a block from any area can be stored only in the cache group with the same group number, but it can occupy any block position within that group.

The translation between main memory address and cache address has two parts: the group address is accessed by direct mapping, and the block address within the group is accessed by content (associatively).

Advantages: the probability of block conflicts is relatively low, block utilization is greatly improved, and the block miss rate drops noticeably.

Disadvantages: implementation difficulty and cost are higher than for direct mapping.
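The rules above can be sketched as follows: the group is chosen directly from the block's position inside its area, and the block is then searched for within that group. The names are hypothetical, and evicting the oldest block of a full group is an assumption made only to keep the sketch short; the text does not tie set-associative mapping to a particular replacement strategy.

```python
class SetAssociativeCache:
    def __init__(self, num_groups: int, blocks_per_group: int, block_size: int):
        self.num_groups = num_groups
        self.blocks_per_group = blocks_per_group
        self.block_size = block_size
        # Each group holds the main memory block numbers currently cached in it.
        self.groups = [[] for _ in range(num_groups)]

    def group_of(self, addr: int) -> int:
        """Direct-mapped part: group number of the block inside its area."""
        mem_block = addr // self.block_size
        blocks_in_cache = self.num_groups * self.blocks_per_group
        block_in_area = mem_block % blocks_in_cache
        return block_in_area // self.blocks_per_group

    def access(self, addr: int) -> bool:
        """Associative part: search the group; load the block on a miss."""
        mem_block = addr // self.block_size
        group = self.groups[self.group_of(addr)]
        if mem_block in group:
            return True
        if len(group) == self.blocks_per_group:
            group.pop(0)  # assumed policy: evict the oldest block in the group
        group.append(mem_block)
        return False


cache = SetAssociativeCache(num_groups=2, blocks_per_group=2, block_size=16)
# Addresses 0 and 64 map to the same group but can now coexist.
print([cache.access(a) for a in (0, 64, 0, 64)])
```

Compared with the direct-mapped case, two blocks that compete for the same group no longer evict each other immediately, which is the lower conflict probability mentioned above.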

Replacement strategy

1. According to the principle of program locality, a running program tends to reuse instructions and data that it has used recently. This provides the theoretical basis for replacement strategies.

(1) Random method (RAND)

The random method picks the block to replace at random: a random number generator is set up, and the block to replace is determined by the number it generates.

(2) First-in, first-out method (FIFO)

FIFO replaces the block that was loaded earliest. A block that was loaded first but is hit many times may still be the first to be replaced, which does not conform to the principle of locality.

(3) Least recently used method (LRU)

The LRU method tracks the usage of each block and always selects the least recently used block for replacement. This method reflects the principle of program locality better.
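To make the difference between the three methods concrete, here is a small sketch that counts hits for a sequence of block accesses. The function name `count_hits`, the policy strings, and the example trace are illustrative assumptions.

```python
import random
from collections import OrderedDict


def count_hits(trace, capacity, policy, seed=0):
    """Count cache hits for a sequence of block numbers.

    policy is "RAND", "FIFO", or "LRU".
    """
    rng = random.Random(seed)
    blocks = OrderedDict()  # front = oldest (FIFO) or least recently used (LRU)
    hits = 0
    for block in trace:
        if block in blocks:
            hits += 1
            if policy == "LRU":
                blocks.move_to_end(block)  # a hit makes the block most recent
            continue
        if len(blocks) == capacity:
            if policy == "RAND":
                del blocks[rng.choice(list(blocks))]
            else:
                blocks.popitem(last=False)  # evict the block at the front
        blocks[block] = True
    return hits


trace = [1, 2, 3, 1, 4, 1, 5, 1]  # block 1 is reused often
for policy in ("RAND", "FIFO", "LRU"):
    print(policy, count_hits(trace, capacity=3, policy=policy))
```

On this trace FIFO evicts block 1 simply because it arrived first, even though it is the block in heaviest use, while LRU keeps it.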

2. In a multi-bank parallel memory system, memory requests from I/O devices have a higher priority than CPU fetches. As a result, the CPU may have to wait while an I/O device accesses memory, sometimes for several main memory cycles, which reduces CPU efficiency. To keep the CPU and I/O devices from competing for memory access, a first-level cache can be added between the CPU and main memory. Main memory can then send the information the CPU needs to the cache in advance. While main memory and an I/O device are exchanging data, the CPU can read the required information directly from the cache without waiting.

3. The algorithms proposed so far fall into the following three categories (the first category is the key one to master):

(1) Traditional replacement algorithms and their direct derivatives. Representative algorithms are:

① LRU (Least Recently Used): replaces the least recently used content in the cache.

② LFU (Least Frequently Used): replaces the least frequently accessed content in the cache.

③ If all the content in the cache was cached on the same day, replaces the largest document; otherwise replaces according to the LRU algorithm.

④ FIFO (First In First Out): follows the first-in, first-out principle. If the cache is full, replaces the item that entered the cache earliest.

(2) Replacement algorithms based on key features of the cached content. Representative algorithms are:

① Size: replaces the largest content in the cache.

② LRU-MIN: tries to minimize the number of documents replaced. If the size of the document to be cached is S, documents of size at least S are replaced according to the LRU algorithm; if there is no object of size at least S, LRU replacement is applied to documents of size at least S/2.

③ LRU-Threshold: the same as the LRU algorithm, except that documents whose size exceeds a given threshold are not cached.

④ Lowest Latency First: replaces the document with the lowest access latency.
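The LRU-MIN rule can be sketched as below. The text only describes the steps at S and S/2; continuing to halve the threshold until enough space is freed is an assumption of this sketch, as are the function name and the data layout (a list of name and size pairs ordered from least to most recently used, with positive sizes).

```python
def lru_min_victims(entries, incoming_size, free_space=0):
    """Choose documents to evict so that incoming_size fits.

    entries: list of (name, size), least recently used first.
    Returns the names of the evicted documents in eviction order.
    """
    remaining = list(entries)
    victims = []
    threshold = incoming_size
    while free_space < incoming_size and remaining:
        for name, size in list(remaining):  # walk in LRU order
            if free_space >= incoming_size:
                break
            if size >= threshold:
                remaining.remove((name, size))
                victims.append(name)
                free_space += size
        threshold /= 2  # assumed: keep halving if space is still short
    return victims


cached = [("a", 10), ("b", 50), ("c", 30), ("d", 60)]
print(lru_min_victims(cached, incoming_size=40))
```

In this example a single document of size at least 40 is evicted, whereas plain LRU would have removed the small document "a" first and then still needed to evict "b".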

(3) Cost-based replacement algorithms. These use a cost function to evaluate the objects in the cache and choose the object to replace according to its cost value. Representative algorithms are:

① Hybrid: assigns a utility function to each object in the cache and replaces the object with the lowest utility.

② Lowest Relative Value: replaces the object with the lowest utility value.

③ Least Normalized Cost Replacement (LCNR): uses an estimation function of document access frequency, transfer time, and size to choose the document to replace.

④ Bolot et al. proposed a weighted estimation function based on document transfer time cost, size, and last access time to choose the document to replace.

⑤ Size-Adjust LRU (SLRU): sorts the cached objects by their cost-to-size ratio and selects the object with the smallest ratio for replacement.
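As a small illustration of the cost-based idea, the sketch below picks the object with the smallest cost-to-size ratio, as in the description of Size-Adjust LRU above. The `CachedObject` fields and the cost values are hypothetical, and how the cost itself is computed is left open.

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    name: str
    cost: float  # hypothetical cost of fetching the object again
    size: int    # size in bytes


def pick_victim(objects):
    """Return the object with the smallest cost-to-size ratio."""
    return min(objects, key=lambda obj: obj.cost / obj.size)


objects = [
    CachedObject("a", cost=5.0, size=10),
    CachedObject("b", cost=1.0, size=100),
    CachedObject("c", cost=8.0, size=4),
]
print(pick_victim(objects).name)
```

A large object that is cheap to fetch again frees the most space for the least loss, so it is the first candidate for replacement.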

Function overview

Throughout the development of computer technology, the access speed of main memory has always been much slower than the operating speed of the central processing unit, so the CPU's high-speed processing capacity cannot be fully used and the efficiency of the whole computer system suffers. There are many ways to ease the speed mismatch between the CPU and main memory, such as using multiple general-purpose registers or multi-bank interleaving. Using a cache memory in the storage hierarchy is another commonly used method. Many large and medium-sized computers, as well as some more recent minicomputers and microcomputers, use a cache memory.

The capacity of a cache memory is generally only a few percent of that of main memory, but its access speed can match that of the central processing unit. According to the principle of program locality, units adjacent to a main memory unit that is currently in use are very likely to be used soon. Therefore, when the CPU accesses a unit of main memory, the hardware automatically transfers the whole group of units containing it into the cache, and the next main memory unit the CPU accesses is very likely to be in the group just transferred. The CPU can then access the cache directly. If most of the CPU's accesses to main memory can be replaced by accesses to the cache, the processing speed of the computer system improves significantly.

Read hit rate

When the CPU finds the data it needs in the cache, this is called a hit. When the cache does not hold the data the CPU needs (a miss), the CPU has to access memory. In theory, in a CPU with a level-2 cache, the read hit rate of the L1 cache is 80%. That is, the useful data the CPU finds in the L1 cache accounts for 80% of the total, and the remaining 20% is read from the L2 cache. Because the data to be used cannot be predicted exactly, the hit rate for reading L2 is also about 80% (useful data read from L2 accounts for 16% of the total). The rest has to be fetched from memory, but that is already a fairly small share. Some high-end CPUs also have an L3 cache, a cache designed for data that is still missing after the L2 cache has been read.
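The percentages above follow from multiplying the hit rates level by level. A short sketch of that arithmetic, using the 80% figures from the text (the function name is illustrative):

```python
def hit_breakdown(l1_hit_rate: float, l2_hit_rate: float):
    """Return the fraction of reads served by L1, by L2, and by main memory."""
    from_l1 = l1_hit_rate
    from_l2 = (1 - l1_hit_rate) * l2_hit_rate  # L2 only sees L1 misses
    from_memory = 1 - from_l1 - from_l2
    return from_l1, from_l2, from_memory


l1, l2, mem = hit_breakdown(0.80, 0.80)
print(f"L1: {l1:.0%}, L2: {l2:.0%}, memory: {mem:.0%}")
```

With both hit rates at 80%, L2 serves 0.20 × 0.80 = 16% of all reads, which leaves 4% for main memory.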

To ensure a high hit rate for CPU accesses, the contents of the cache should be replaced according to some algorithm. A commonly used one is the least recently used (LRU) algorithm, which evicts the rows that have been accessed least recently. It requires a counter for each row. The LRU algorithm clears the counter of the row that is hit and adds 1 to the counters of all other rows. When a replacement is needed, the row with the largest counter value is evicted. This is an efficient algorithm. Clearing the counter on each hit lets data that was called frequently but is no longer needed be pushed out of the cache, which improves cache utilization.
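The counter scheme just described can be sketched directly. The class name is hypothetical; filling empty rows before evicting, and clearing the counter of a newly loaded row, are assumptions the text does not spell out.

```python
EMPTY = object()  # marker for a row that holds no block yet


class CounterLRU:
    def __init__(self, num_rows: int):
        self.rows = [EMPTY] * num_rows
        self.counters = [0] * num_rows

    def access(self, block) -> bool:
        """Return True on a hit; on a miss, load the block and return False."""
        hit = block in self.rows
        if hit:
            row = self.rows.index(block)
        elif EMPTY in self.rows:
            row = self.rows.index(EMPTY)  # assumed: use free rows first
        else:
            row = self.counters.index(max(self.counters))  # largest counter
        self.rows[row] = block
        # Clear the counter of the touched row, add 1 to all the others.
        for i in range(len(self.counters)):
            self.counters[i] = 0 if i == row else self.counters[i] + 1
        return hit


cache = CounterLRU(num_rows=2)
print([cache.access(b) for b in ("A", "B", "A", "C", "B")])
```

Each access touches every counter, which hints at why LRU costs more to implement than FIFO or random replacement.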

Impact of the cache replacement algorithm on the hit rate. When a new main memory block needs to be transferred into the cache and the available space is already full, data in the cache has to be replaced, which raises the question of the replacement strategy (algorithm). According to the principle of program locality, a running program tends to use instructions and data that it has used recently. This provides the theoretical basis for replacement strategies. The goal of a replacement algorithm is to give the cache the highest possible hit rate. The cache replacement algorithm is an important factor in the performance of a proxy cache system, and a good one yields a higher hit rate. The commonly used algorithms are as follows:

(1) Random method (RAND). The random replacement algorithm uses a random number generator to produce the number of the block to be replaced and replaces that block. It is simple and easy to implement. However, it takes no account of the past, present, or future use of cache blocks, makes no use of the "historical information" about how the upper-level memory has used them, and does not follow the principle of locality of memory access, so the cache hit rate cannot be improved and remains low.

(2) First-in, first-out method (FIFO). The FIFO algorithm replaces the block that entered the cache first. It determines the order of eviction from the order in which blocks were transferred into the cache and selects the earliest one for replacement. It does not need to record how each block is used, so it is relatively easy to implement and has low system overhead. Its disadvantage is that program blocks that are used frequently (such as loops) are also replaced just because they entered the cache earliest. It is not based on the principle of locality of memory access, so the cache hit rate cannot be improved: the earliest information may still be used in the future, or may be in frequent use, as in a loop. The method is nevertheless simple and practical.

(3) Least recently used method (LRU). This method replaces the least recently used block in the cache. It performs better than first-in, first-out, but it cannot guarantee that a block that has not been used recently will not be needed in the near future. The LRU method always selects the least recently used block for replacement based on the usage of each block. Although it reflects the principle of program locality better, it has to keep track of the usage of every block in the cache at all times in order to determine which block is the least recently used. The LRU algorithm is relatively reasonable, but it is more complicated to implement and its system overhead is relatively large. It usually requires a hardware or software module called a counter for each block to record its use.